The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Contact for when WayBack internet archival fails to grab content

Posted by jpluimers on 2021/06/07

For my link archive, some tweets. [WayBack] Mark Graham is the person to contact in case archiving a link in the WayBack machine fails.

These are the steps for my link archival:

  1. Check whether the page saves and renders with the WayBack machine; if so, copy the saved URL and the original URL.
  2. Check whether it saves and renders with archive.is; if so, copy the saved URL and the original URL.
  3. If neither worked, use the original URL and link text, but note that it was unsavable; otherwise, prepend the original URL and link text with [WayBack] or [Archive.is] linking to the saved URL.
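The decision in step 3 can be sketched as a small helper. This is only an illustration, not the tooling actually used for this blog: the function name `format_archived_link`, its parameters, the Markdown-style output, and the preference for a WayBack copy over an Archive.is copy when both exist are all assumptions. The saved URLs are expected to come from steps 1 and 2, which are done by hand here.

```python
from typing import Optional


def format_archived_link(
    original_url: str,
    link_text: str,
    wayback_url: Optional[str] = None,
    archive_is_url: Optional[str] = None,
) -> str:
    """Build a link entry from the results of the manual archival steps.

    Hypothetical helper: the Markdown output format and the WayBack-first
    preference are assumptions, not something the steps prescribe.
    """
    original = f"[{link_text}]({original_url})"
    if wayback_url:
        # Step 1 succeeded: prefix with a [WayBack] link to the saved copy.
        return f"[[WayBack]]({wayback_url}) {original}"
    if archive_is_url:
        # Step 2 succeeded: prefix with an [Archive.is] link instead.
        return f"[[Archive.is]]({archive_is_url}) {original}"
    # Neither service could save the page: keep the original, note the failure.
    return f"{original} (could not be archived)"
```

For example, calling `format_archived_link("https://example.com/a", "Example")` without any saved URL returns `[Example](https://example.com/a) (could not be archived)`.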

Reporting history gist: https://gist.github.com/jpluimers/6115b3cd6dab568ebd1c10ebddfaf140

–jeroen

 

 

For my search terms: the contact for when I cannot archive something in the WayBack machine is [WayBack] Mark Graham (@MarkGraham) | Twitter

  • Mark Graham
  • @MarkGraham
  • Director, the Wayback Machine, at the Internet Archive. Co-founder http://APC.org . Former SVP, NBCUniversal News Digital, Runner, Seeker & Buddhist
  • Half Moon Bay, CA
  • archive.org/web/
  • Joined March 2007
  • Born December 27

Failing WayBack archival URLs

In early June 2019, Mark Graham asked me to share a URL that fails to save in the WayBack machine.

Below is the start of a list, so I can track when each URL was reported and when it could be archived.

Note that not being able to archive URLs can be both a problem of the WayBack machine and of the page/site being archived.

So this document is absolutely not about blame, just a means to assist the WayBack machine and sites to do better archiving.

20190614: rdw.nl

20190614: support.microsoft.com URL

Cookies are disabled

Please enable cookies and refresh the page

20190614: bitbucket URL tries to refresh

20190614: vng.nl URL cannot be archived

20190614: More Dutch government sites cannot be archived

20190615: Chrome webstore URLs get mangled after load

20190622: PacktPub

Later that day, the error went away. Not sure what happened, or what rectified it.

20190622: Archived VMware KB shows empty page

The archived URL shows an empty page.

Same for this one:

20190714: Reddit fails saving

20190722: GeForce forums fail saving

Fails in both the WayBack machine and Archive.is:

  • Archived URL 1 fails on display, showing a timeline and blank content

  • Archived URL fails on display, showing a constantly refreshing page

  • Archive.is fails while saving with:

     Error: Network error.
    status	type	size		url
    0	GET	https://forums.geforce.com/default/topic/966823/geforce-drivers/displayport-autodetect-override-view-system-topology-on-desktop-cards/post/5190822/
    
