The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘WayBack machine’ Category

Interactive @waybackmachine achievement unlocked while manually archiving 4 pages.: HTTP 429 Too Many Requests

Posted by jpluimers on 2022/06/20

[Wayback/Archive] Jeroen Wiert Pluimers on Twitter: “Interactive @waybackmachine achievement unlocked while manually archiving 4 pages. web.archive.org/429.html”.

The error below took a few hours to recover from. When I checked afterwards, the submitted URLs had in fact already been archived.

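It was about the URLs in my blog post earlier today: Vanaf 1 juli kost opheffen oude spaarrekening EUR 75, dus wees er snel bij: Beëindig je oude spaarproduct – ING – Sparen (in English: from July 1, closing an old savings account costs EUR 75, so be quick: end your old savings product).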

I really wish Archive.org had a status page showing system status; right now you have to guess their status from pages like the one below.

You can find the error page at [Archive] https://web.archive.org/429.html. Not all HTTP response codes have pages like this, and some respond in a different way, like [Archive] https://web.archive.org/404.html.
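For scripted archiving, an HTTP 429 is best answered by waiting before the next attempt. The sketch below only computes how long to wait. It honours a `Retry-After` header when one is present (standard HTTP allows either a number of seconds or an HTTP date) and otherwise falls back to exponential backoff. Whether the Wayback Machine actually sends `Retry-After` is an assumption, and the function name `retry_delay` and the `base` and `cap` values are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(retry_after, attempt, base=60.0, cap=3600.0, now=None):
    """Return the number of seconds to wait after an HTTP 429 response.

    retry_after: value of the Retry-After header, or None when absent.
    attempt:     zero-based count of retries already made.
    base, cap:   hypothetical defaults, not values published by archive.org.
    """
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            # Retry-After given as delay-seconds.
            return min(float(value), cap)
        try:
            # Retry-After given as an HTTP date.
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            current = now or datetime.now(timezone.utc)
            return min(max((when - current).total_seconds(), 0.0), cap)
    # No usable header: exponential backoff, capped.
    return min(base * (2 ** attempt), cap)
```

Given that the block mentioned above lasted a few hours, a generous cap and very few retries seem the polite choice.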

Read the rest of this entry »

Posted in Internet, InternetArchive, Power User, WayBack machine | Leave a Comment »

Wayback machine and VMware KB links

Posted by jpluimers on 2022/03/22

The VMware KB is notoriously bad at being saved in the WayBack Machine: saved links hardly render at all because of the dynamic page loading structure of the VMware KB.

But VMware KB articles expire, so a lot of web pages point to links that no longer exist and end up, through redirections, at [Archive.is] https://kb.vmware.com/s/pagenotfound.

Below are a few link forms of the same VMware KB 2011818 article, which vanished from the regular web. The first is saved in the WayBack Machine but does not render. The second is saved and does render, after a redirect to a saved copy of the third form. The most recent saved copy of the fourth form is actually a 404 error that redirects to an earlier copy of the third form.

  1. https://kb.vmware.com/s/article/2011818
  2. http://kb.vmware.com/kb/2011818
  3. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011818
  4. http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011818

The first link form does archive as a rendered page in Archive.is, if it is archived. It wasn’t, so the current archived version points to the “pagenotfound” page mentioned above.
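Since every KB article can show up under all four link forms, it helps to generate them from the article number before checking each one in an archive. A minimal sketch, with the patterns copied from the list above; the function name `kb_link_forms` is made up for illustration:

```python
def kb_link_forms(kb_id):
    """Return the four known link forms for a VMware KB article number."""
    search = (
        "selfservice/microsites/search.do"
        "?language=en_US&cmd=displayKC&externalId={}".format(kb_id)
    )
    return [
        "https://kb.vmware.com/s/article/{}".format(kb_id),
        "http://kb.vmware.com/kb/{}".format(kb_id),
        "http://kb.vmware.com/" + search,
        "http://kb.vmware.com:80/" + search,
    ]


if __name__ == "__main__":
    for link in kb_link_forms(2011818):
        print(link)
```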

Sometimes you have to dig deeper, as not all rendering archived versions contain actual content.

Here the first one is not even archived; the others are, but none of them has usable content:

  1. https://kb.vmware.com/s/article/2007922
  2. http://kb.vmware.com/kb/2007922
  3. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922
  4. http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922

This means you have to dig further in history:

  1. https://web.archive.org/web/20140123114343/http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922 indicates not authorized
  2. https://web.archive.org/web/20130117041323/http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922 shows the actual content.
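Digging through history by hand gets tedious, so the candidate snapshots can be listed with the Wayback CDX server. The sketch below only builds the query URL and turns decoded JSON rows into snapshot links; it does not perform the request. The endpoint and the `url`, `output`, `filter` and `limit` parameters are written from memory of the CDX server documentation, so treat them as assumptions to verify. The function names are made up for illustration.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(url, limit=50):
    """Build a CDX query for captures of `url` that returned HTTP 200."""
    params = {
        "url": url,
        "output": "json",
        "filter": "statuscode:200",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(cdx_rows):
    """Convert decoded CDX JSON rows into Wayback snapshot URLs.

    The first row is expected to hold the field names, including
    "timestamp" and "original".
    """
    if not cdx_rows:
        return []
    header, *rows = cdx_rows
    ts = header.index("timestamp")
    original = header.index("original")
    return [
        "https://web.archive.org/web/{}/{}".format(row[ts], row[original])
        for row in rows
    ]
```

A 200 status does not guarantee usable content: a “not authorized” page like the one above may well have been served with status 200, so each snapshot still needs a look.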

–jeroen

Posted in Internet, InternetArchive, link rot, Power User, WayBack machine, WWW - the World Wide Web of information | Leave a Comment »

Digital accessibility is hard; Wayback archival of: Formulieren – CIZ

Posted by jpluimers on 2022/03/17

I know that digital accessibility does not come for free, but it is mandatory in Europe for at least documents and websites provided by government and semi-government bodies, as per [Wayback] EN 301 549 – Wikipedia:

EN 301 549 is a European standard for digital accessibility. It specifies requirements for information and communications technology to be accessible for people with disabilities.

I bumped into numerous tab-order issues when filling out CIZ forms. This makes it way harder for me, as I now need a mouse despite having had RSI symptoms for some 30+ years.

So, for my link archive, and so I can document that all these forms have severe tab-order issues (some fields are not accessible by keyboard at all, are emptied when you leave the field, or are not even accessible by mouse): [Wayback] Formulieren – CIZ

Are you submitting an application to the CIZ? On this page you will find an overview of our forms, such as an authorization form and the Wlz application form.

Hopefully by now the forms have been fixed.

Via:

Read the rest of this entry »

Posted in About, InternetArchive, LifeHacker, Personal, Power User, WayBack machine | Leave a Comment »

ESXi: some notes on .vswp files; there are actually two types of them!

Posted by jpluimers on 2022/02/23

Earlier this month, I ended ESXi: editing /etc/vmware/hostd/vmInventory.xml to fix the datastore UUID for unavailable VMs part 2 with this:

A final note: I need to check out if .vswp files need to be there at all, as my ESXi servers have plenty of physical memory in order not to swap out to disk. More on that in a future blog post.

Browsing back through my blog posts, I mentioned .vswp files before, but never really dug into them:

Read the rest of this entry »

Posted in ArchiveTeamWarrior, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Internet, InternetArchive, Power User, Virtualization, VMware, VMware ESXi, WayBack machine | Leave a Comment »

When high SEO ranking fails to give you a reliable result: IsItDownRightNow.com failed to detect the WayBack Machine outage

Posted by jpluimers on 2022/02/11

A high SEO ranking does not automatically indicate a reliable result.

When the WayBack Machine was down a while ago (it responded to traceroute UDP requests, but would not establish TCP connections on ports 80 and 443), the first Google hit for detecting down status (searching for [Archive.is] waybackmachine down – Google Search) failed miserably because it redirected web.archive.org (which fails) to http://www.archive.org (which succeeds):

IsItDownRightNow failing to detect web.archive.org downtime
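The symptom described above (no TCP connections on ports 80 and 443 for web.archive.org while www.archive.org still answers) can be checked directly, without any redirect getting in the way. A minimal sketch using only the Python standard library; the function names and the 5 second timeout are arbitrary choices for illustration:

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_host(host, ports=(80, 443), timeout=5.0):
    """Check each port on exactly this host name, following no redirects."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Check both names separately: one being up says nothing about the other.
    for name in ("web.archive.org", "www.archive.org"):
        print(name, check_host(name))
```

A check from a single machine only tells you about your own network path, which is why asking around or using a multi-location checker is still useful.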

Luckily when asking around on Twitter:

  • others were experiencing the same problem, not just in The Netherlands, but also in other countries
  • after trying a few things, the WayBack machine got back up [Archive.is] before I could try cURL.
  • I got pointed at www.uptrends.com/tools/uptime, which does check the right subdomain and showed it was down from many locations:

Read the rest of this entry »

Posted in *nix, cURL, Infrastructure, Internet, InternetArchive, LifeHacker, Power User, WayBack machine | Leave a Comment »

ESXi: editing /etc/vmware/hostd/vmInventory.xml to fix the datastore UUID for unavailable VMs part 2

Posted by jpluimers on 2022/02/01

I started my post ESXi: editing /etc/vmware/hostd/vmInventory.xml to fix the datastore UUID for unavailable VMs with

In case I ever need this on ESXi: Insights into the VMware inventory files (vmAutoStart.xml and vmInventory.xml on ESXi; inventory.vmls on VMware Workstation/Player)

Since almost all of my blog is about things I bumped into in real life, this post was a preparation because I kind of expected this to indeed happen, and it did.

Below are the screenshots and steps I took. Of course it is an N=1 experience, so your situation might differ, but I tried to be thorough and not miss any steps.

Read the rest of this entry »

Posted in ArchiveTeamWarrior, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Internet, InternetArchive, Power User, Virtualization, VMware, VMware ESXi, WayBack machine | Leave a Comment »

Happy 20th birthday WayBack machine and thanks Brewster for starting Internet Archive almost 25 years ago

Posted by jpluimers on 2021/10/24

Today, 20 years ago, the Wayback Machine started to unlock the archived content that the Internet Archive had been crawling since 1996 and make it accessible for the public at large.

Thanks Brewster Kahle for making all of this possible for such a long time!

Read the rest of this entry »

Posted in History, Internet, InternetArchive, Power User, WayBack machine | Leave a Comment »

Highly esteemed science: An analysis of attitudes towards and perceived attributes of science in letters to the editor in two Dutch newspapers – Stefan P.L. de Jong, Elena Ketting, Leonie van Drooge, 2020

Posted by jpluimers on 2021/10/06

All my IPv4 addresses seem to be blocked with messages like this (note the odd, but allowed, leading zero in the IPv4 address [WayBack]):

Error

The IP you are accessing the site with (037.153.243.242) has been blocked because it has triggered one of our security measures. Please see the reason below:
Block reason: This IP was identified as infiltrated and is being used by sci-hub as a proxy.
To restore access, please contact onlinesupport@sagepub.com citing this message in full.
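The leading zero in that address is not just cosmetic: some parsers read an octet with a leading zero as decimal, while classic `inet_aton`-style parsers read it as octal. The sketch below shows both interpretations for the address in the message. Both function names are made up for illustration, and which interpretation SAGE's software uses is unknown.

```python
def octets_as_decimal(address):
    """Read every octet as decimal, ignoring leading zeros."""
    parts = address.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("not a dotted-quad IPv4 address: %r" % address)
    octets = [int(p, 10) for p in parts]
    if any(o > 255 for o in octets):
        raise ValueError("octet out of range in %r" % address)
    return ".".join(str(o) for o in octets)


def octets_inet_aton_style(address):
    """Read octets like classic inet_aton: leading 0 is octal, 0x is hex."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("not a dotted-quad IPv4 address: %r" % address)
    octets = []
    for p in parts:
        if p.lower().startswith("0x"):
            value = int(p, 16)
        elif len(p) > 1 and p.startswith("0"):
            value = int(p, 8)  # raises ValueError for digits 8 and 9
        else:
            value = int(p, 10)
        if not 0 <= value <= 255:
            raise ValueError("octet out of range in %r" % address)
        octets.append(value)
    return ".".join(str(o) for o in octets)


if __name__ == "__main__":
    blocked = "037.153.243.242"
    print(octets_as_decimal(blocked))       # decimal reading
    print(octets_inet_aton_style(blocked))  # octal reading of the first octet
```

The two readings differ in the first octet, so a message like the one above can point at a different address depending on which parser reads it.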

A quick [WayBack] “This IP was identified as infiltrated and is being used by sci-hub as a proxy.” – Google Search shows they also block the Google Bot.

I am not even going to bother with companies that have bad infiltration detection.

Of course I ensured the paper has been archived:

[WayBack/Archive.is] Highly esteemed science: An analysis of attitudes towards and perceived attributes of science in letters to the editor in two Dutch newspapers – Stefan P.L. de Jong, Elena Ketting, Leonie van Drooge, 2020.

Note I do not run sci-hub, though I am tempted to. For more info: [WayBack] Sci-Hub – Wikipedia

I checked the router and web-proxy for any suspicious activity. There is none.

I do run the ArchiveBot by the ArchiveTeam to support the WayBackMachine of the InternetArchive, and the great team Mark Graham has there, by providing some bandwidth and CPU/memory resources that help them archive public internet content for posterity.

If that triggers SAGE, too bad for them.

–jeroen

Read the rest of this entry »

Posted in Development, Internet, InternetArchive, LifeHacker, Power User, Software Development, WayBack machine, Web Development | Leave a Comment »

Windows and the current state of S.M.A.R.T. tooling that understands NVMe

Posted by jpluimers on 2021/09/16

I had trouble with two Intel 600p NVMe SSD devices: read-errors.

It appeared that only a few tools understand how to get S.M.A.R.T. health information from them, and even those did not explain the read errors.

I’m going to RMA them, but in case anyone else needs to get health information from NVMe SSD devices, here is which tools do what:

So basically, CrystalDiskInfo and HD Tune are my first line of checking for drive issues, followed by smartmontools to get text output, then by vendor specific tools to assist with the RMA.

In the past, I used another smartmontools wrapper, but it was discontinued and had an even older version than GSmartControl: Source: Closed: HDD Guardian – Home.

On Intel 600p becoming locked in read-only mode after failure:

Start of Intel RMA procedure via [Wayback] Warranty Information.

My case looks remarkably similar to [Wayback] Full Diagnostic Scan always fails during Read Scan on my SSD 600p Series 256GB – Intel Community.

A few screenshots of the tools I used for health information:

Read the rest of this entry »

Posted in Hardware, NVMe, Power User, SSD, WayBack machine | Leave a Comment »

Overview of Client Libraries · Internet Archive

Posted by jpluimers on 2021/09/14

Besides manual upload at [Archive.is] Upload to Internet Archive, there are also automated ways of uploading content.

One day I will need this to archive pages or sites into the WayBack machine: [WayBack] Overview of Client Libraries · Internet Archive (most of which are Python based):
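Independent of those client libraries, a single page can be submitted with nothing but the Python standard library. The sketch below only prepares the request, so nothing is sent until it is passed to `urlopen`. It assumes the plain `https://web.archive.org/save/<url>` form of Save Page Now; the function name and the `User-Agent` string are made up for illustration.

```python
from urllib.request import Request

SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_request(url, user_agent="example-archiver/0.1"):
    """Prepare (but do not send) a Save Page Now request for `url`."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("only http(s) URLs can be archived: %r" % url)
    # The target URL is appended as-is, as in the manual /save/ form.
    return Request(SAVE_PREFIX + url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    request = save_page_now_request("http://kb.vmware.com/kb/2011818")
    print(request.get_method(), request.full_url)
    # To actually submit: urllib.request.urlopen(request, timeout=60)
    # Expect an HTTP 429 when submitting too many pages in a short time.
```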

Read the rest of this entry »

Posted in Bookmarklet, Development, Internet, InternetArchive, Power User, Python, Scripting, Software Development, WayBack machine, Web Browsers | Leave a Comment »