Archive for the ‘WayBack machine’ Category
Posted by jpluimers on 2019/10/19
I got this a while ago while saving a bunch of links for my blog; unfortunately, nobody at the email address responded to my request for information:
Too Many Requests
We are limiting the number of URLs you can submit to be Archived to the Wayback Machine, using the Save Page Now features, to no more than 15 per minute.
If you submit more than that we will block Save Page Now requests from your IP number for one day.
Please feel free to write to us at info@archive.org if you have questions about this. Please include your IP address and any URLs in the email so we can provide you with better service.
I wish there were a queue service that makes you wait longer, but does fulfill the request.
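Until such a queue exists on the server side, the waiting can be done on the client. Below is a minimal Python sketch that paces submissions evenly so they stay under the 15 per minute quoted above. The function name `drain_queue`, the default of 12 per minute (a safety margin) and the `submit` callback are assumptions for illustration; the actual Save Page Now call is deliberately left to the caller.

```python
import time
from typing import Callable, Iterable, List

LIMIT_PER_MINUTE = 15  # limit quoted in the "Too Many Requests" message


def drain_queue(urls: Iterable[str],
                submit: Callable[[str], None],
                per_minute: int = 12,  # assumed safety margin below the limit
                sleep: Callable[[float], None] = time.sleep,
                clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Call submit(url) for every URL, spacing the calls evenly.

    `submit` is a hypothetical callback that performs the real request.
    `sleep` and `clock` are injectable so the pacing can be tested.
    """
    if not 0 < per_minute <= LIMIT_PER_MINUTE:
        raise ValueError("per_minute must be between 1 and the limit")
    interval = 60.0 / per_minute  # seconds between two submissions
    next_slot = clock()           # earliest moment the next call may start
    done: List[str] = []
    for url in urls:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)          # wait longer, but still fulfill the request
        start = clock()
        submit(url)
        done.append(url)
        next_slot = start + interval
    return done
```

With the default of 12 per minute this waits 5 seconds between submissions, so a list of 100 links takes a bit over 8 minutes instead of getting the IP address blocked for a day.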
–jeroen
Posted in Internet, InternetArchive, Power User, WayBack machine
Posted by jpluimers on 2019/08/16
When archiving pages in the WayBack machine, it still managed to set truckloads of cookies, despite Privacy Badger being set to “save no cookies”.
So I used the Chrome settings at chrome://settings/content/cookies to disable cookies, and now everything is fine.
–jeroen
Posted in Chrome, Google, Internet, InternetArchive, Power User, Privacy, WayBack machine
Posted by jpluimers on 2019/05/27
When you get the response “web.archive.org unexpectedly closed the connection” without even returning an HTTP code, but:
- it works in anonymous mode
- it works with all extensions turned off
then there are likely too many cookies for archive.org and/or web.archive.org: in my case, I had 90 cookies.
Cleaning these cookies out resolved the problem (I used [WayBack] Awesome Cookie Manager for this).
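To check how many cookies have piled up before cleaning, the count can be read from Chrome's cookie database. This is a minimal Python sketch under assumptions: it expects an already opened SQLite connection to a copy of that database (made while Chrome is closed), and it assumes the `cookies` table has a `host_key` column holding the cookie's host. The function name `count_cookies` is made up for illustration.

```python
import sqlite3
from typing import List, Tuple


def count_cookies(conn: sqlite3.Connection,
                  domain: str = "archive.org") -> List[Tuple[str, int]]:
    """Return (host, cookie count) pairs for `domain` and its subdomains.

    Assumes a table `cookies` with a `host_key` column (an assumption
    about Chrome's cookie store, not something taken from this post).
    """
    rows = conn.execute(
        "SELECT host_key, COUNT(*) FROM cookies "
        "WHERE host_key = ? OR host_key LIKE ? "
        "GROUP BY host_key "
        "ORDER BY COUNT(*) DESC, host_key",
        (domain, "%." + domain),  # matches .archive.org and web.archive.org
    )
    return rows.fetchall()
```

Calling it with a connection from `sqlite3.connect(...)` on the copied file gives one row per host, so it is easy to see whether archive.org or web.archive.org holds the bulk of the cookies.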
Edit 20231230: Awesome Cookie Manager source repository at [Wayback/Archive] Phatsuo/awesome-cookie-manager: Awesome Cookie Manager.

--jeroen
Posted in Chrome, Google, Internet, InternetArchive, Power User, WayBack machine
Posted by jpluimers on 2019/04/18
I still have to do this every few weeks on all my desktop machines: [WayBack] When +Google Nederland maps only fills none or part of the map tiles… – Jeroen Wiert Pluimers – Google+
When +Google Nederland maps fills none or only part of the map tiles at https://maps.google.nl, but https://maps.google.com works fine, then remove any gsScrollPos cookies from www.google.nl.
I need to do this every couple of days to keep maps.google.nl working.

Later I also found it can happen for YouTube, then did more digging for gsScrollPos and found a better workaround: [WayBack] Awesome Cookie Manager where you can just delete the gsScrollPos cookies from all sites in one go.
Even later I found out that this can be one of the causes of the WayBack machine giving an error 400 when archiving. A more common reason, however, is that many archived web pages try to create cookies in the web.archive.org subdomain, resulting in the same problem.
The cause seems to be the Great Suspender plugin, which should be fixed by now but might not automatically update to the latest version. See:
Pending a new Great Suspender release, below is a quick way to manually remove them if you are into SQL scripting for sqlite. It basically comes down to executing the below statement when Chrome is closed:
delete from cookies where name like 'gsScrollPos-%'
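For those who prefer a script over a SQLite shell, the same statement can be run from Python's standard `sqlite3` module. This is a minimal sketch, not a finished tool: the function name `delete_gsscrollpos` is made up, and the caller is assumed to open the connection to the Chrome cookie database while Chrome is closed (its location differs per system and profile, so it is not spelled out here).

```python
import sqlite3


def delete_gsscrollpos(conn: sqlite3.Connection) -> int:
    """Delete all gsScrollPos-* cookies and return how many were removed."""
    with conn:  # commits on success, rolls back if the statement fails
        cursor = conn.execute(
            "DELETE FROM cookies WHERE name LIKE 'gsScrollPos-%'"
        )
    return cursor.rowcount
```

Making a backup copy of the database file first is a sensible precaution, since the delete is committed immediately.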
Edit 20231230: Awesome Cookie Manager source repository at [Wayback/Archive] Phatsuo/awesome-cookie-manager: Awesome Cookie Manager.
--jeroen
Posted in Chrome, Google, GoogleMaps, Internet, InternetArchive, Power User, WayBack machine
Posted by jpluimers on 2019/03/18
Soon this is a thing of the past, but for just a few more days, you can help: Archiving Google+.
Either run this project: [WayBack] GitHub – ArchiveTeam/googleplus-grab: Archiving Google+.
Or even better: run the appliance, and help the WayBack machine with any archiving projects set up by the virtual appliance: the [WayBack] ArchiveTeam Warrior – Archiveteam.
See some of their other pages for more background information:
You can donate to both the Archive Team and the Internet Archive:
How is G+ archiving doing?
The tracker is well under way: [WayBack] Googleplus tracker Dashboard. History: archive.is 1; archive.is 2
Posted in ArchiveTeamWarrior, Development, G+: GooglePlus, Google, Internet, InternetArchive, Power User, Python, Scripting, SocialMedia, Software Development, WayBack machine
Posted by jpluimers on 2018/10/15
I’ve used these myself:
There are many more listed in, for instance, these links:
IIPC OpenWayBack:
–jeroen
Posted in Internet, InternetArchive, Power User, WayBack machine
Posted by jpluimers on 2017/12/22
If saving a web-page on the WayBack machine throws you this error on any site:
Wayback Exception
An unknown exception has occurred. Unexpected Error
Then usually the cause is having too many cookies.
Clean your cookies for web.archive.org, then try again.
–jeroen
Posted in Internet, InternetArchive, Power User, WayBack machine | 2 Comments »
Posted by jpluimers on 2017/12/17
In an era where we’ve become dependent on 24/7 communications and availability of the internet, but even more so on archives of information that appeared, became “fake” and was then denied, the Internet Archive (including the WayBack machine) was down for a few hours because of a PG&E power outage in San Francisco.
(Posted late because, well, the WordPress.com “missed schedule” bug is back.)
So this is a reminder to sponsor the Internet Archive. Because we can.

–jeroen
Posted in Internet, InternetArchive, Missed Schedule, Power User, SocialMedia, WayBack machine, WordPress