The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘InternetArchive’ Category

Forgot where I found it, but for posterity: bitnet-links-Bitnet-Network-Definition-verison-89.xlsx

Posted by jpluimers on 2023/12/15

I forgot where I originally downloaded bitnet-links-Bitnet-Network-Definition-verison-89.xlsx from, but for posterity, here it is:

[Wayback] bitnet-links-Bitnet-Network-Definition-verison-89.xlsx

Related blog posts:

–jeroen

Posted in BITNET Relay, Chat, History, Internet, InternetArchive, Power User, SocialMedia | Leave a Comment »

The Wayback Machine Chrome extension got a big update. Every journalist & researcher should install it ASAP! Faster URL archiving w/ customization, access to yr personal archive, and it tells you if the page you’re on has already been archived, etc.

Posted by jpluimers on 2023/12/05

Last year I learned about [Wayback/Archive] Wayback Machine – Chrome Web Store via a Twitter thread starting at

[Wayback/Archive] Craig Silverman on Twitter: “The Wayback Machine Chrome extension got a big update. Every journalist & researcher should install it ASAP! Faster URL archiving w/ customization, access to yr personal archive, and it tells you if the page you’re on has already been archived, etc. #osint”

I saved the full thread at [Wayback/Archive] Thread by @CraigSilverman on Thread Reader App:

Read the rest of this entry »

Posted in Bookmarklet, Chrome, Internet, InternetArchive, LifeHacker, OSINT - Open Source Intelligence, Power User, Uncategorized, WayBack machine, Web Browsers | Leave a Comment »

Avoid VirtualBox; use Hyper-V or VMware instead

Posted by jpluimers on 2023/11/10

A while ago, Jilles found out why not to use VirtualBox: [Wayback/Archive] Jilles🏳️‍🌈 on Twitter: “@jpluimers Based on what the Arch community is shouting, "VirtualBox is stupid; if you do not use Hyper-V you are asking for trouble", I am going to give HYPER-V a try.” / Twitter

The biggest problem is that VirtualBox seems to be developed and tested for the happy path, not the failing path.

Which means that when you use it for less common scenarios, it will often fail in mysterious ways.

Back in Running ArchiveTeam Warrior version 3.2 on ESXi, I already mentioned this:

Totally agreeing with Kristian Köhntopp, I do not understand why people use VirtualBox at all: I just run into too many issues like [Archive.is] Kristian Köhntopp on Twitter: “Hint: if installing a Linux distro in VirtualBox fails with varying, unknown errors, it helps to try VMware Workstation or kvm instead. In my case it then worked *every* *single* *time* with *the same* ISO.”.

Read the rest of this entry »

Posted in *nix, *nix-tools, ArchiveTeamWarrior, Hyper-V, InternetArchive, Linux, Power User, VirtualBox, Virtualization, VMware, WayBack machine, Windows, Windows 10, Windows 11 | 1 Comment »

Bookmarklet to navigate from a page to the most recent saved WayBack machine entry

Posted by jpluimers on 2023/10/04

A while ago, while writing last week's post XPath based bookmarklets for Archive.is: more JavaScript fiddling!, I needed the most recent WayBack Machine archival of

https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

I vaguely remembered replacing the normal timestamp with a 3 and 13 zeros, so I tried this

https://web.archive.org/web/30000000000000/https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

And indeed, it did an HTTP 302 redirect to

https://web.archive.org/web/20220312161117/https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

So I quickly made this bookmarklet:

javascript:location.href='https://web.archive.org/web/30000000000000/'+document.location.href;
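The same trick can be written as a plain function, which is easier to test outside the browser. This is a minimal sketch: the names LATEST_SENTINEL and latestSnapshotUrl are made up for illustration, and it only relies on the redirect behaviour described above.

```javascript
// Sentinel timestamp: a 3 followed by 13 zeros (14 digits in total).
// It lies far in the future, so the Wayback Machine redirects to the
// closest saved snapshot, which is the most recent one.
const LATEST_SENTINEL = '3' + '0'.repeat(13);

// Hypothetical helper: builds the URL that the bookmarklet navigates to.
function latestSnapshotUrl(pageUrl) {
  return 'https://web.archive.org/web/' + LATEST_SENTINEL + '/' + pageUrl;
}
```

In a bookmarklet you would pass document.location.href as pageUrl and assign the result to location.href.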

Then I created another one for getting the screenshot:

javascript:location=location.href.replace(/^https:\/\/web\.archive\.org\/save\/http/,'https://web.archive.org/web/30000000000000/http://web.archive.org/screenshot/http')

That works for screenshots archived with a Wayback Machine account, as these are related because of the inserted http://web.archive.org/screenshot/ fragment:

Since the Wayback Machine always looks for the closest saved timestamp, it does not matter that the timestamps in these archived pages have a slight mismatch.
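The screenshot bookmarklet boils down to a single regular-expression replace on a "save" URL. Here is a small sketch of that rewrite as a standalone function; the names SAVE_PREFIX and screenshotUrlFromSaveUrl are hypothetical, and the regular expression and replacement text are the ones from the bookmarklet.

```javascript
// Matches the start of a Wayback Machine "save" URL, up to and including
// the "http" of the page that was saved.
const SAVE_PREFIX = /^https:\/\/web\.archive\.org\/save\/http/;

// Hypothetical helper: rewrites a "save" URL into the URL of the most
// recent archived screenshot of the same page.
// URLs that do not start with the save prefix are returned unchanged.
function screenshotUrlFromSaveUrl(saveUrl) {
  return saveUrl.replace(
    SAVE_PREFIX,
    'https://web.archive.org/web/30000000000000/http://web.archive.org/screenshot/http'
  );
}
```

Because the pattern stops right after "http", the rest of the original URL (including an "s" for https pages) is kept as is.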

Memory lane

20231006: I edited this section to refer to two prior blog posts instead of one because of [Wayback/Archive] pbeccard: “@wiert @oliof You can also use…” – Mastodon (clearly showing that Mastodon, like any social media platform, does mangle backtick quoted code):

@wiert @oliof You can also use `javascript:location.href=’web.archive.org/web/*/’+docume to get the overview. I find this quite useful since I often want an older version of a page.

And later in the reply chain:

[Wayback/Archive] pbeccard: “@wiert @oliof Ah, I thought b…” – Mastodon

@wiert @oliof Ah, I thought by now that maybe Markdown is supported. I pulled the bookmarklet out of my bookmarklet bookmark folder. Here is a copy: https://gist.github.com/corppneq/d61e3…

[Wayback/Archive] Gist: Bookmarklets

I also found two earlier blog posts:

  1. Need to write a proper bookmarklet for the wayback archive (: mentioning many useful Wayback Machine JavaScript Bookmarklets from my gist [Wayback/Archive] Ideas/inspiration for writing a proper WayBack archive.org bookmarklet including this one:

    [Wayback/Archive] http://www.gyford.com/misc/wayback.html

      • WayBack:

        javascript:location.href='http://web.archive.org/web/*/'+document.location.href;
        

    I also archived this referred page: [Wayback/Archive] Bookmarklets.com – What’s New.

  2. JavaScript bookmarklet to replace part of the WayBack machine URL, with this bookmarklet:

    JavaScript bookmarklet to replace part of the WayBack machine URL:

    A bookmarklet that goes to the latest rendered saved version (sometimes saved versions have not been rendered yet, so you get the latest available render):

    javascript:location=location.href.replace(/^https:\/\/web\.archive\.org\/save\/http/,'https://web.archive.org/web/30000000000000/http')

    The WayBack Machine uses a 14-position ID and tries to find the render closest to it. This is the format of the ID:

    yyyymmddhhmmss

    This is granular enough, as the WayBack machine only allows new saves that are usually 30+ minutes apart.

    (Note that this period now seems to have increased from 30+ minutes to 45+ minutes.)
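To make the yyyymmddhhmmss format concrete, here is a rough sketch that turns a JavaScript Date into such an ID and builds a URL asking for the render closest to that moment. The function names are made up for illustration, and it assumes the ID is interpreted as UTC.

```javascript
// Formats a Date as the 14-digit yyyymmddhhmmss ID.
// Assumption: the Wayback Machine interprets the ID as UTC.
function toWaybackTimestamp(date) {
  const pad = (value, width) => String(value).padStart(width, '0');
  return (
    pad(date.getUTCFullYear(), 4) +
    pad(date.getUTCMonth() + 1, 2) + // getUTCMonth() is zero-based
    pad(date.getUTCDate(), 2) +
    pad(date.getUTCHours(), 2) +
    pad(date.getUTCMinutes(), 2) +
    pad(date.getUTCSeconds(), 2)
  );
}

// Hypothetical helper: URL asking for the render closest to a given moment.
function closestSnapshotUrl(pageUrl, date) {
  return 'https://web.archive.org/web/' + toWaybackTimestamp(date) + '/' + pageUrl;
}
```

For example, 12 March 2022 at 16:11:17 UTC becomes 20220312161117, the ID in the redirect target shown earlier in this post.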

I also found this post, which has a similar-looking number: 0.30000000000000004.com. How cool is WordPress search (:

–jeroen

Posted in Bookmarklet, Development, Internet, InternetArchive, JavaScript/ECMAScript, Power User, Scripting, Software Development, WayBack machine, Web Browsers | Leave a Comment »

Bookmarklet for Archive.is to navigate to the canonical link

Posted by jpluimers on 2023/08/15

This is a follow-up to Bookmarklets for Archive.is and the WayBack Machine to go to the original page.

Archive.is has two kinds of URLs:

  1. The encoded version is the short form without any meta-information.
  2. The canonical version is a long form and has metadata about the archive date and time, and the archived URL.

You get the first URL both after archiving and when browsing from an archived page to another archived page (if it is not archived, you will go to the unarchived full page URL).

Read the rest of this entry »

Posted in archive.is / archive.today, Development, Internet, InternetArchive, JavaScript/ECMAScript, Power User, Scripting, Software Development, WayBack machine | Leave a Comment »

DPReview archives: how accessible will they be?

Posted by jpluimers on 2023/04/10

There are various posts indicating part or all of DPReview will be archived:

  1. [Wayback/Archive] DPReview closure: an update: Digital Photography Review
  2. [Wayback/Archive] The Wayback Machine on Twitter: “@jpluimers @geerlingguy @internetarchive We are “on it””
  3. [Wayback/Archive] DPReview – Archiveteam
  4. [Wayback/Archive] Digicam Finder · The most complete and accurate digital camera data source on the internet (1994 — 2023)  which is open source at [Wayback/Archive] open-product-data/digital-cameras: The most complete and accurate digital camera* data on the internet, assembled and maintained by the community. (via [Wayback/Archive] Good news — the camera feature search and all data is saved | Migration | DPRevived)

I wonder how accessible each form of archive will be. The last entry in the above list is very accessible, but only has the camera data (which is a very important aspect, but do not underestimate the forum with millions of posts either).

–jeroen

Posted in ArchiveTeamWarrior, Internet, InternetArchive, Photography, Power User | Leave a Comment »

Working around Archive.is/.today/.ph/.li/.vn/.fo/.md eternal spinner “Loading” when trying to archive a page

Posted by jpluimers on 2023/01/13

I have had the below Archive.is spinner “Loading” without any progress indication on a couple of URLs the last few months and I think they are tied to having special characters in the URL-to-be-archived.

My usual workaround was to first archive in the Wayback Machine, then archive the resulting URL in Archive.is, as it would automatically follow the path up to the original URL.

That of course failed when https://web.archive.org/web/*/vx-underground.org did not want to save in Archive.is: it gave an eternal spinner on the “Loading” page no matter which browser I used, and no matter whether the URL had the escaped %2A or a literal *:

Read the rest of this entry »

Posted in archive.is / archive.today, Conference Topics, Conferences, Event, Internet, InternetArchive, LifeHacker, Power User, WayBack machine | Leave a Comment »

Interactive @waybackmachine achievement unlocked while manually archiving 4 pages.: HTTP 429 Too Many Requests

Posted by jpluimers on 2022/06/20

[Wayback/Archive] Jeroen Wiert Pluimers on Twitter: “Interactive @waybackmachine achievement unlocked while manually archiving 4 pages. web.archive.org/429.html.”

The below error took a few hours to recover from. When I checked afterwards, the submitted URLs had indeed already been archived.

It was about the URLs in my blog post earlier today: Vanaf 1 juli kost opheffen oude spaarrekening EUR 75, dus wees er snel bij: Beëindig je oude spaarproduct – ING – Sparen.

I really wish Archive.org had a status page to show system status, as right now you have to guess their status from pages like the one below.

You can find the error page at [Archive] https://web.archive.org/429.html (but not all HTTP response codes have pages like this and some respond in a different way like [Archive] https://web.archive.org/404.html).

Read the rest of this entry »

Posted in Internet, InternetArchive, Power User, WayBack machine | Leave a Comment »

Wayback machine and VMware KB links

Posted by jpluimers on 2022/03/22

The VMware KB is notoriously bad at being saved in the WayBack Machine: saved links hardly render at all because of the VMware KB dynamic page loading structure.

But VMware KB articles expire, so a lot of web-pages point to non-existing links and end up through redirections at [Archive.is] https://kb.vmware.com/s/pagenotfound.

Below are a few link forms of the same VMware KB 2011818 article that vanished from the regular web. The first is saved in the WayBack Machine (but does not render); the second is saved and does render after a redirect to a saved third form; the most recently saved fourth form is actually a 404 error that redirects to an earlier save of the third form.

  1. https://kb.vmware.com/s/article/2011818
  2. http://kb.vmware.com/kb/2011818
  3. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011818
  4. http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011818
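When hunting for a vanished KB article, it helps to generate all four link forms from the KB number so you can try each of them in the WayBack Machine. A small sketch, where the function name vmwareKbLinkForms is made up and the URL patterns are simply copied from the list above:

```javascript
// Hypothetical helper: returns the four link forms seen above for one
// VMware KB number, in the same order as the list.
function vmwareKbLinkForms(kbId) {
  const query = 'language=en_US&cmd=displayKC&externalId=' + kbId;
  return [
    'https://kb.vmware.com/s/article/' + kbId,
    'http://kb.vmware.com/kb/' + kbId,
    'http://kb.vmware.com/selfservice/microsites/search.do?' + query,
    'http://kb.vmware.com:80/selfservice/microsites/search.do?' + query,
  ];
}
```

Prefixing each result with https://web.archive.org/web/*/ gives the overview of all saved versions for that form.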

The first link form does archive as a rendered page in Archive.is if it is archived. It wasn’t, so the current archived version points to the “pagenotfound” page mentioned above.

Sometimes you have to dig deeper, as not all rendering archived versions contain actual content.

Here the first one is not even archived, the other ones are, but none of them have actual usable content:

  1. https://kb.vmware.com/s/article/2007922
  2. http://kb.vmware.com/kb/2007922
  3. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922
  4. http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922

This means you have to dig further in history:

  1. https://web.archive.org/web/20140123114343/http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922 indicates not authorized
  2. https://web.archive.org/web/20130117041323/http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922 shows the actual content.

–jeroen

Posted in Internet, InternetArchive, link rot, Power User, WayBack machine, WWW - the World Wide Web of information | Leave a Comment »

Digital accessibility is hard; Wayback archival of: Formulieren – CIZ

Posted by jpluimers on 2022/03/17

I know that digital accessibility does not come for free, but it is mandatory in Europe for at least documents and web-sites provided by government and semi-government as per [Wayback] EN 301 549 – Wikipedia

EN 301 549 is a European standard for digital accessibility. It specifies requirements for information and communications technology to be accessible for people with disabilities.

I bumped into numerous tab-order issues when filling out CIZ forms. This makes it way harder for me, as now I require a mouse despite having RSI symptoms for some 30+ years.

So, for my link archive, and so I can document that all these forms have severe tab-order issues (some fields are not even accessible by keyboard, are emptied when you leave the field, or are not even accessible by mouse): [Wayback] Formulieren – CIZ

Are you submitting an application to the CIZ? On this page you will find an overview of our forms, such as an authorization form and the Wlz application form.

Hopefully by now the forms have been fixed.

Via:

Read the rest of this entry »

Posted in About, InternetArchive, LifeHacker, Personal, Power User, WayBack machine | Leave a Comment »