The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘InternetArchive’ Category

DPReview archives: how accessible will they be?

Posted by jpluimers on 2023/04/10

There are various posts indicating part or all of DPReview will be archived:

  1. [Wayback/Archive] DPReview closure: an update: Digital Photography Review
  2. [Wayback/Archive] The Wayback Machine on Twitter: “@jpluimers @geerlingguy @internetarchive We are “on it””
  3. [Wayback/Archive] DPReview – Archiveteam
  4. [Wayback/Archive] Digicam Finder · The most complete and accurate digital camera data source on the internet (1994 — 2023)  which is open source at [Wayback/Archive] open-product-data/digital-cameras: The most complete and accurate digital camera* data on the internet, assembled and maintained by the community. (via [Wayback/Archive] Good news — the camera feature search and all data is saved | Migration | DPRevived)

I wonder how accessible each form of archive will be. The last entry in the above list is very accessible, but only has the camera data (which is a very important aspect, but do not underestimate the forum with millions of posts either).

–jeroen

Posted in ArchiveTeamWarrior, Internet, InternetArchive, Photography, Power User | Leave a Comment »

Working around Archive.is/.today/.ph/.li/.vn/.fo/.md eternal spinner “Loading” when trying to archive a page

Posted by jpluimers on 2023/01/13

Over the last few months, Archive.is has shown me the “Loading” spinner below, without any progress indication, on a couple of URLs. I think this is tied to special characters in the URL-to-be-archived.

My usual workaround was to first archive in the Wayback Machine, then archive the resulting URL in Archive.is, as it would automatically follow the path up to the original URL.

That of course failed when https://web.archive.org/web/*/vx-underground.org did not want to save in Archive.is: both the escaped %2A form and the literal * form gave an eternal spinner on the “Loading” page, no matter which browser I used:

Read the rest of this entry »

Posted in archive.is / archive.today, Conference Topics, Conferences, Event, Internet, InternetArchive, LifeHacker, Power User, WayBack machine | Leave a Comment »

Interactive @waybackmachine achievement unlocked while manually archiving 4 pages.: HTTP 429 Too Many Requests

Posted by jpluimers on 2022/06/20

[Wayback/Archive] Jeroen Wiert Pluimers on Twitter: “Interactive @waybackmachine achievement unlocked while manually archiving 4 pages. web.archive.org/429.html.”

The below error took a few hours to recover from. When I checked, the submitted URLs had indeed already been archived.

It was about the URLs in my blog post earlier today: Vanaf 1 juli kost opheffen oude spaarrekening EUR 75, dus wees er snel bij: Beëindig je oude spaarproduct – ING – Sparen (in English: from 1 July, closing an old savings account costs EUR 75, so be quick: end your old savings product).

I really wish Archive.org had a status page to show system status, as right now you have to guess their status from pages like the one below.

You can find the error page at [Archive] https://web.archive.org/429.html (but not all HTTP response codes have pages like this and some respond in a different way like [Archive] https://web.archive.org/404.html).
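
A minimal sketch of how a script could back off when the Wayback Machine answers HTTP 429. It assumes curl and the https://web.archive.org/save/ endpoint; the function names backoff_delay and save_with_backoff, the 60 second base delay and the retry count are my own illustrative assumptions, not documented Archive.org guidance. Given that my recovery took hours, a few retries may well not be enough.

```shell
#!/bin/sh
# Hypothetical helper: seconds to wait before retry number $1 (0-based),
# doubling from an assumed base of 60 seconds: 60, 120, 240, ...
backoff_delay() {
    remaining=$1
    delay=60
    while [ "${remaining}" -gt 0 ]; do
        delay=$((delay * 2))
        remaining=$((remaining - 1))
    done
    echo "${delay}"
}

# Hypothetical helper: submit URL $1 to the Wayback Machine save endpoint,
# retrying up to $2 times (default 3) for as long as the answer is HTTP 429.
# Prints the last HTTP status code; returns 1 when it is still 429.
save_with_backoff() {
    url=$1
    max_attempts=${2:-3}
    attempt=0
    while [ "${attempt}" -lt "${max_attempts}" ]; do
        # -s: silent; -o /dev/null: drop the body; -w: print only the status code
        status=$(curl -s -o /dev/null -w '%{http_code}' "https://web.archive.org/save/${url}")
        if [ "${status}" != "429" ]; then
            echo "${status}"
            return 0
        fi
        sleep "$(backoff_delay "${attempt}")"
        attempt=$((attempt + 1))
    done
    echo "429"
    return 1
}
```

Usage would be something like save_with_backoff "https://example.org/" 4, where the URL is a placeholder.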

Read the rest of this entry »

Posted in Internet, InternetArchive, Power User, WayBack machine | Leave a Comment »

Wayback machine and VMware KB links

Posted by jpluimers on 2022/03/22

The VMware KB is notoriously bad at being saved in the WayBack Machine: saved links hardly render at all because of the VMware KB dynamic page loading structure.

But VMware KB articles expire, so a lot of web-pages point to non-existing links and end up through redirections at [Archive.is] https://kb.vmware.com/s/pagenotfound.

Below are a few link forms of the same VMware KB 2011818 article that vanished from the regular web. The first is saved in the WayBack Machine (but does not render); the second is saved and does render after a redirect to a saved third form; the most recently saved fourth form is actually a 404 error redirecting to a prior third form.

  1. https://kb.vmware.com/s/article/2011818
  2. http://kb.vmware.com/kb/2011818
  3. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011818
  4. http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011818
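
Since the four forms differ only in the KB number, a small helper can generate them so each one can be checked in the archives. This is just a sketch: the function name vmware_kb_link_forms is made up for illustration, and the URL patterns are simply the ones from the list above.

```shell
#!/bin/sh
# Hypothetical helper: print the four link forms listed above for the
# VMware KB article number passed as $1, one per line, in the same order.
vmware_kb_link_forms() {
    kb_id=$1
    query="language=en_US&cmd=displayKC&externalId=${kb_id}"
    echo "https://kb.vmware.com/s/article/${kb_id}"
    echo "http://kb.vmware.com/kb/${kb_id}"
    echo "http://kb.vmware.com/selfservice/microsites/search.do?${query}"
    echo "http://kb.vmware.com:80/selfservice/microsites/search.do?${query}"
}
```

Calling vmware_kb_link_forms 2011818 prints the list above.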

The first link form does archive as a rendered page in Archive.is if it is archived. It wasn’t, so the current archived version points to the “pagenotfound” page mentioned above.

Sometimes you have to dig deeper, as not all rendering archived versions contain actual content.

Here the first one is not even archived, the other ones are, but none of them have actual usable content:

  1. https://kb.vmware.com/s/article/2007922
  2. http://kb.vmware.com/kb/2007922
  3. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922
  4. http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922

This means you have to dig further in history:

  1. https://web.archive.org/web/20140123114343/http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922 indicates not authorized
  2. https://web.archive.org/web/20130117041323/http://kb.vmware.com:80/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007922 shows the actual content.
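
Digging through history comes down to putting different timestamps in front of the same original URL. A sketch of that URL pattern, as seen in the two links above; the function name wayback_url is an assumption of mine, and the script only builds the URL, it does not check that a capture exists.

```shell
#!/bin/sh
# Hypothetical helper: build a Wayback Machine URL for the original URL in $1.
# $2 is an optional 14-digit timestamp (YYYYMMDDhhmmss); without it a literal
# "*" is used, which asks for the overview of all captures of that URL.
wayback_url() {
    original_url=$1
    timestamp=${2:-*}
    echo "https://web.archive.org/web/${timestamp}/${original_url}"
}
```

For example, wayback_url "http://kb.vmware.com/kb/2007922" 20130117041323 builds a link for the older capture date, and leaving out the timestamp gives the overview form to browse from.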

–jeroen

Posted in Internet, InternetArchive, link rot, Power User, WayBack machine, WWW - the World Wide Web of information | Leave a Comment »

Digital accessibility is hard; Wayback archival of: Formulieren – CIZ

Posted by jpluimers on 2022/03/17

I know that digital accessibility does not come for free, but it is mandatory in Europe for at least documents and web-sites provided by government and semi-government as per [Wayback] EN 301 549 – Wikipedia:

EN 301 549 is a European standard for digital accessibility. It specifies requirements for information and communications technology to be accessible for people with disabilities.

I bumped into numerous tab-order issues when filling out CIZ forms. This makes it way harder for me, as now I require a mouse despite having had RSI symptoms for some 30+ years.

So, for my link archive, so I can document that all these forms have severe tab-order issues (some fields are not even accessible by keyboard, are being emptied when you leave the field, or are not even accessible by mouse): [Wayback] Formulieren – CIZ

Are you submitting an application to the CIZ? On this page you will find an overview of our forms, such as an authorisation form and the Wlz application form.

Hopefully by now the forms have been fixed.

Via:

Read the rest of this entry »

Posted in About, InternetArchive, LifeHacker, Personal, Power User, WayBack machine | Leave a Comment »

ESXi: some notes on .vswp files; there are actually two types of them!

Posted by jpluimers on 2022/02/23

Earlier this month, I ended ESXi: editing /etc/vmware/hostd/vmInventory.xml to fix the datastore UUID for unavailable VMs part 2 with this:

A final note: I need to check out if .vswp files need to be there at all, as my ESXi servers have plenty of physical memory in order not to swap out to disk. More on that in a future blog post.

Browsing back through my blog posts, I mentioned .vswp files before, but never really dug into them:

Read the rest of this entry »

Posted in ArchiveTeamWarrior, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Internet, InternetArchive, Power User, Virtualization, VMware, VMware ESXi, WayBack machine | Leave a Comment »

Archive.is is more like a thread unroll service than an archival service

Posted by jpluimers on 2022/02/14

An interesting take a while ago on [Wayback] Archive.is blog — People often compare various features of…

People often compare various features of archive.is to those of archive.org being mistaken by name similarity (and recently added “save a page” function to archive.org).

This project is different in at least two respects:

  1. We have no goal to save the entire Internet. Only manually submitted pages which may be deleted/altered soon. We are about 100x smaller than archive.org in the storage space (700TB vs. 70PB) and expenses (X,000 $/mo vs. X00,000 $/mo).
  2. The pages are not saved in their network form. Archive.today launches real browsers (not even headless) and tries to load lazy images, unroll folded content, login into accounts if prompted with login form, remove “subscribe our maillist” modals, … So archive.today is not suitable for making notarized or digitally signed snapshots.

It would be more correct to compare it with other thread unrollers.

The RSS feed of blog.archive.today is at blog.archive.today/rss

Read the rest of this entry »

Posted in archive.is / archive.today, Bookmarklet, Conference Topics, Conferences, Development, Event, Internet, InternetArchive, JavaScript/ECMAScript, Power User, Scripting, Software Development, Web Browsers | Leave a Comment »

When high SEO ranking fails to give you a reliable result: IsItDownRightNow.com failed to detect the WayBack Machine outage

Posted by jpluimers on 2022/02/11

A high SEO ranking does not automatically indicate a reliable result.

When the WayBack Machine was down a while ago (it responded to traceroute UDP requests, but would not establish TCP connections on ports 80 and 443), the first Google hit for detecting down status (searching for [Archive.is] waybackmachine down – Google Search) failed miserably because it redirected web.archive.org (which fails) to http://www.archive.org (which succeeds):

IsItDownRightNow failing to detect web.archive.org downtime
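
A rough sketch of the kind of cURL check that tests the exact host instead of a look-alike. The function names describe_curl_exit and check_site and the timeout values are assumptions of mine; the exit codes are the ones documented in the curl manual.

```shell
#!/bin/sh
# Hypothetical helper: translate a curl exit code into a rough verdict.
# Exit codes per the curl manual: 0 = OK, 6 = could not resolve host,
# 7 = failed to connect, 28 = operation timed out.
describe_curl_exit() {
    case "$1" in
        0)  echo "up" ;;
        6)  echo "dns-failure" ;;
        7)  echo "connect-failed" ;;
        28) echo "timeout" ;;
        *)  echo "other-error-$1" ;;
    esac
}

# Hypothetical helper: try to connect to exactly the URL in $1.
# No -L is passed, so a redirect to another host cannot mask an outage.
check_site() {
    curl -s -o /dev/null --connect-timeout 10 --max-time 20 "$1"
    describe_curl_exit $?
}
```

Running check_site https://web.archive.org/ next to check_site http://www.archive.org/ would show whether only one of the two hosts accepts TCP connections.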

Luckily when asking around on Twitter:

  • others were experiencing the same problem, not just in The Netherlands, but also in other countries
  • after trying a few things, the WayBack machine got back up [Archive.is] before I could try cURL.
  • I got pointed at www.uptrends.com/tools/uptime which does check the right subdomain and shows it is down from many locations:

Read the rest of this entry »

Posted in *nix, cURL, Infrastructure, Internet, InternetArchive, LifeHacker, Power User, WayBack machine | Leave a Comment »

ESXi: editing /etc/vmware/hostd/vmInventory.xml to fix the datastore UUID for unavailable VMs part 2

Posted by jpluimers on 2022/02/01

I started my post ESXi: editing /etc/vmware/hostd/vmInventory.xml to fix the datastore UUID for unavailable VMs with

In case I ever need this on ESXi: Insights into the VMware inventory files (vmAutoStart.xml and vmInventory.xml on ESXi; inventory.vmls on VMware Workstation/Player)

Since almost all of my blog is about things I bumped into in real life, this post was a preparation because I kind of expected this to indeed happen, and it did.

Below are the screenshots and steps I took. Of course it is an N=1 experience, so your situation might differ, but I tried to be thorough and not miss any steps.

Read the rest of this entry »

Posted in ArchiveTeamWarrior, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Internet, InternetArchive, Power User, Virtualization, VMware, VMware ESXi, WayBack machine | Leave a Comment »

ESXi: on the console/ssh, when a moved VM pauses during power-on: show which VMs have messages waiting, then answer them

Posted by jpluimers on 2022/01/27

First the script that displays messages for all virtual machines, vim-cmd-display-messages-for-all-VMs.sh:

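#!/bin/sh
# Display pending messages for all VMs registered on this ESXi host.
# Take the numeric VM id (first column) from each VM line of getallvms; the header line does not match.
vmids=`vim-cmd vmsvc/getallvms | sed -n -E -e "s/^([[:digit:]]+)\s+((\S.+\S)?)\s+(\[\S+\])\s+(.+\.vmx)\s+(\S+)\s+(vmx-[[:digit:]]+)\s*?((\S.+)?)$/\1/p"`
for vmid in ${vmids} ; do
    # power.getstate prints a header line first; drop it to keep only the state.
    powerState=`vim-cmd vmsvc/power.getstate ${vmid} | sed '1d'`
    # The display name is in the vim.vm.ConfigInfo section, the .vmx path in the vim.vm.FileInfo section.
    name=`vim-cmd vmsvc/get.config ${vmid} | sed -n -E -e '/\(vim.vm.ConfigInfo\) \{/,/files = \(vim.vm.FileInfo\) \{/ s/^ +name = "(.*)",.*?/\1/p'`
    vmPathName=`vim-cmd vmsvc/get.config ${vmid} | sed -n -E -e '/files = \(vim.vm.FileInfo\) \{/,/tools = \(vim.vm.ToolsConfigInfo\) \{/ s/^ +vmPathName = "(.*)",.*?/\1/p'`
    echo "Messages for VM with id ${vmid} which has power state ${powerState} (name = ${name}; vmPathName = ${vmPathName})."
    # Prints "No message." when nothing is waiting for an answer.
    vim-cmd vmsvc/message ${vmid}
done
exit 0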

It is very similar to vim-cmd-reload-all-VM-vmx-configurations.sh from Source: ESXi: reloading all virtual machines from their (potentially) vmx files.

The messages I have seen are either “No message.” or about “This virtual machine may have been moved or copied.”

If no message is available, you always get the stock message “No message.”, so this is something you can use as a check in scripts.
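
A sketch of such a check. The function name has_pending_message is hypothetical, and it assumes the output is exactly the stock text without extra whitespace, which I only know from my own machines.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when the text in $1 is anything
# other than the stock "No message." answer, so the VM is likely waiting.
has_pending_message() {
    [ "$1" != "No message." ]
}

# Usage sketch on ESXi (the VM id 42 is made up):
#   output=$(vim-cmd vmsvc/message 42)
#   if has_pending_message "${output}"; then
#       echo "VM 42 is waiting for an answer"
#   fi
```

Command substitution strips the trailing newline, which is why a plain string comparison is enough here.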

Read the rest of this entry »

Posted in *nix, *nix-tools, ArchiveTeamWarrior, ash/dash, ash/dash development, Development, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Power User, Scripting, Software Development, Virtualization, VMware, VMware ESXi | Leave a Comment »