The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘WayBack machine’ Category

Google Search teamed up with the Internet Archive’s Wayback Machine: the good, the bad, the ugly

Posted by jpluimers on 2024/09/16

TL;DR: Google Search now also shows (after 3+ manual steps) the most recent Wayback Machine archived page for a web-page search result. That helps tremendously for pages that are temporarily off-line (everyone knows how stable the cloud – someone else’s computers – or on-premise computing is), but it takes too many steps and still doesn’t index the full Wayback Machine.

But there is a Clint Eastwood movie title here, even after the devastating fact that Google now off-loads its Google Cache to the Wayback Machine (which many sites refuse to be archived in), as per [Wayback/Archive] Google will no longer back up the Internet: Cached webpages are dead | Ars Technica:

The good

Many posted the links to the big news last week:

Read the rest of this entry »

Posted in Google, GoogleSearch, Internet, InternetArchive, Power User, WayBack machine | Leave a Comment »

Deleted Tweet Finder

Posted by jpluimers on 2024/06/24

[Wayback/Archive] Deleted Tweet Finder taught me there is another web page archival site next to the Wayback Machine and Archive.is (also known as Archive Today): GhostArchive, which was established in 2021, right when I was recovering from more than a year of cancer treatments.

They have quite a few ways to address an archived URL, of which this is the main entry point: https://ghostarchive.org/search?term=https%3A%2F%2Ftwitter.com%2Fhisvault_eth%2Fstatus%2F1802834724114649422

Reminder to self: figure out the URLs that trigger archival.

Via

Note that the Google Webcache is not really an archival site, nor is there a possibility to trigger archival.

The URL structure there is https://webcache.googleusercontent.com/search?q=cache:https%3A%2F%2Ftwitter.com%2Fhisvault_eth%2Fstatus%2F1802834724114649422 (the part after cache: is the page link after URL encoding)
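
Both URL structures above can be sketched in a few lines of JavaScript. This is a minimal sketch, not an official API of either site: the function and constant names are hypothetical, and only the two URL prefixes are taken from the examples in this post. It assumes `encodeURIComponent` produces the encoding both sites expect, which matches the example URLs shown here. The Google Webcache URL is built for illustration only and might not resolve any more.

```javascript
// Hypothetical helper names; only the two URL prefixes come from the post.
const GHOSTARCHIVE_SEARCH = 'https://ghostarchive.org/search?term=';
const GOOGLE_WEBCACHE = 'https://webcache.googleusercontent.com/search?q=cache:';

// Build the GhostArchive search URL for a page (URL-encoded page link).
function ghostArchiveSearchUrl(pageUrl) {
  return GHOSTARCHIVE_SEARCH + encodeURIComponent(pageUrl);
}

// Build the Google Webcache URL: the part after "cache:" is the encoded page link.
function googleWebcacheUrl(pageUrl) {
  return GOOGLE_WEBCACHE + encodeURIComponent(pageUrl);
}

const tweet = 'https://twitter.com/hisvault_eth/status/1802834724114649422';
console.log(ghostArchiveSearchUrl(tweet));
console.log(googleWebcacheUrl(tweet));
```

As a bookmarklet, the same idea would pass `document.location.href` instead of a fixed URL.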

--jeroen

Posted in archive.is / archive.today, Archiving, Internet, InternetArchive, Power User, WayBack machine | Leave a Comment »

Climbing up from a deep pit is just as admirable as climbing a mountain (via Liz Fosslien)

Posted by jpluimers on 2024/05/06

A while ago, within 24 hours, I bumped into both of these great illustrations about accomplishments that help strengthen your mental state.

Time to give the authors a boost:

You can find more about their books and workshops at [Wayback/Archive] Liz + Mollie Feel Things.

The illustrations that triggered me

Read the rest of this entry »

Posted in About, archive.is / archive.today, Awareness, Health, Inclusion / inclusive society, Instagram, Internet, InternetArchive, LifeHacker, Personal, Power User, SocialMedia, Twitter, WayBack machine | Leave a Comment »

etched: permanent, but slow way for storing/retrieving archived web-content

Posted by jpluimers on 2024/03/08

[Wayback/Archive.is] about etched:

etched is an internet archive tool that permanently timestamps and stores web pages directly into the Bitcoin BSV blockchain.

This is a major improvement from traditional web archivers as all etched pages are permanently stored and independently provable by anyone who has access to the bitcoin blockchain. This means even if etched shuts down anyone can search and view all previously saved data using bitcoin browsers like Bottle.

Via [Wayback] Archive.is blog — See if you suddenly died and that hardware failure…:

For redundancy, try something like etched.page, they store pages on Bitcoin blockchain.

Example: [Wayback] etched archive of nos.nl, 2021-09-10

Related:

--jeroen

Posted in archive.is / archive.today, Internet, InternetArchive, Power User, WayBack machine, Web Browsers | Leave a Comment »

The death of ESXi finally confirmed by Broadcom

Posted by jpluimers on 2024/02/12

Quite a few people already bumped into this the last two days (will add those links later), so today’s confirmation of the ESXi death by Broadcom – who have a similar modus operandi as companies like Computer Associates and Symantec had and Idera has now – does not come as a surprise.

Read the rest of this entry »

Posted in Internet, InternetArchive, Power User, Virtualization, VMware, VMware ESXi, WayBack machine | Leave a Comment »

The Wayback Machine Chrome extension got a big update. Every journalist & researcher should install it ASAP! Faster URL archiving w/ customization, access to yr personal archive, and it tells you if the page you’re on has already been archived, etc.

Posted by jpluimers on 2023/12/05

Last year I learned about [Wayback/Archive] Wayback Machine – Chrome Web Store via a Twitter thread starting at

[Wayback/Archive] Craig Silverman on Twitter: “The Wayback Machine Chrome extension got a big update. Every journalist & researcher should install it ASAP! Faster URL archiving w/ customization, access to yr personal archive, and it tells you if the page you’re on has already been archived, etc. #osint”

I saved the full thread at [Wayback/Archive] Thread by @CraigSilverman on Thread Reader App:

Read the rest of this entry »

Posted in Bookmarklet, Chrome, Internet, InternetArchive, LifeHacker, OSINT - Open Source Intelligence, Power User, Uncategorized, WayBack machine, Web Browsers | Leave a Comment »

Avoid VirtualBox; use Hyper-V or VMware instead

Posted by jpluimers on 2023/11/10

A while ago, Jilles found out why not to use VirtualBox: [Wayback/Archive] Jilles🏳️‍🌈 on Twitter (translated from Dutch): “@jpluimers Based on what the Arch community is shouting, “Virtualbox is stupid, if you don’t use hyper-v you are asking for trouble”, I am going to give HYPER-V a try.” / Twitter

The biggest problem is that VirtualBox seems to be developed and tested for the happy path, not the failing path.

Which means that when you use it for less common scenarios, it will often fail in mysterious ways.

Back in Running ArchiveTeam Warrior version 3.2 on ESXi, I already mentioned this:

Totally agreeing with Kristian Köhntopp, I do not understand why people use VirtualBox at all: I just run into too many issues like [Archive.is] Kristian Köhntopp on Twitter (translated from German): “Hint: when installing a Linux distro in Virtualbox fails with varying, unknown errors, it helps to try VMware Workstation or kvm instead. In my case it then worked *every* *single* *time* with *the same* ISO.”.

Read the rest of this entry »

Posted in *nix, *nix-tools, ArchiveTeamWarrior, Hyper-V, InternetArchive, Linux, Power User, VirtualBox, Virtualization, VMware, WayBack machine, Windows, Windows 10, Windows 11 | 1 Comment »

Bookmarklet to navigate from a page to the most recent saved WayBack machine entry

Posted by jpluimers on 2023/10/04

A while ago, while writing last week’s post XPath based bookmarklets for Archive.is: more JavaScript fiddling!, I needed the most recent WayBack Machine archival of

https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

I vaguely remembered replacing the normal timestamp with a 3 and 13 zeros, so I tried this

https://web.archive.org/web/30000000000000/https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

And indeed, it did an HTTP 302 redirect to

https://web.archive.org/web/20220312161117/https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript

So I quickly made this bookmarklet:

javascript:location.href='https://web.archive.org/web/30000000000000/'+document.location.href;
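
The bookmarklet above can also be written as a plain function, which makes the trick easier to test outside the browser. This is a minimal sketch: `latestWaybackUrl` and `FAR_FUTURE_TIMESTAMP` are hypothetical names, and the redirect to the most recent snapshot is the behaviour observed in this post, not something the code itself checks.

```javascript
// A 3 followed by 13 zeros: a 14-digit timestamp far in the future.
const FAR_FUTURE_TIMESTAMP = '30000000000000';

// Build the Wayback Machine URL that, as observed in the post, redirects
// to the most recent saved snapshot of pageUrl.
function latestWaybackUrl(pageUrl) {
  return 'https://web.archive.org/web/' + FAR_FUTURE_TIMESTAMP + '/' + pageUrl;
}

console.log(latestWaybackUrl(
  'https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript'
));
```

Note that, like the bookmarklet, the page URL is appended without URL encoding.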

Then I created another one for getting the screenshot:

javascript:location=location.href.replace(/^https:\/\/web\.archive\.org\/save\/http/,'https://web.archive.org/web/30000000000000/http://web.archive.org/screenshot/http')

That works for screenshots archived with a Wayback Machine account, as these are related because of the inserted http://web.archive.org/screenshot/ fragment:

Since the Wayback Machine always looks for the closest saved timestamp, it does not matter that the timestamps in these archived pages have a slight mismatch.
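
The screenshot bookmarklet boils down to a single regular expression replace on the address of a Wayback Machine save page. Here is a minimal sketch of that replace as a standalone function; the name `saveUrlToLatestScreenshotUrl` is hypothetical, and the example input URL is made up for illustration.

```javascript
// Turn a Wayback Machine "save" URL into the URL of the most recent
// archived screenshot of the same page (same regex as the bookmarklet).
// URLs that do not start with the /save/ prefix are returned unchanged.
function saveUrlToLatestScreenshotUrl(saveUrl) {
  return saveUrl.replace(
    /^https:\/\/web\.archive\.org\/save\/http/,
    'https://web.archive.org/web/30000000000000/http://web.archive.org/screenshot/http'
  );
}

console.log(saveUrlToLatestScreenshotUrl('https://web.archive.org/save/https://example.com/'));
```

Because the pattern ends in `http`, it matches both `http://` and `https://` page URLs and leaves the rest of the page URL untouched.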

Memory lane

20231006: I edited this section, referring to two prior blog posts instead of one, because of [Wayback/Archive] pbeccard: “@wiert @oliof You can also use…” – Mastodon (clearly showing that Mastodon, like any social media platform, does mangle backtick quoted code):

@wiert @oliof You can also use `javascript:location.href=’web.archive.org/web/*/’+docume to get the overview. I find this quite useful since I often want an older version of a page.

And later in the reply chain:

[Wayback/Archive] pbeccard: “@wiert @oliof Ah, I thought b…” – Mastodon

@wiert @oliof Ah, I thought by now that maybe Markdown is supported. I pulled the bookmarklet out of my bookmarklet bookmark folder. Here is a copy: https://gist.github.com/corppneq/d61e3…

[Wayback/Archive] Gist: Bookmarklets

I also found back two blog posts:

  1. Need to write a proper bookmarklet for the wayback archive (: mentioning many useful Wayback Machine JavaScript Bookmarklets from my gist [Wayback/Archive] Ideas/inspiration for writing a proper WayBack archive.org bookmarklet including this one:

    [Wayback/Archive] http://www.gyford.com/misc/wayback.html

      • WayBack:

        javascript:location.href='http://web.archive.org/web/*/'+document.location.href;
        

    I also archived this referred page: [Wayback/Archive] Bookmarklets.com – What’s New.

  2. JavaScript bookmarklet to replace part of the WayBack machine URL with a bookmarklet replacing

    JavaScript bookmarklet to replace part of the WayBack machine URL:

    A bookmarklet that goes to the latest rendered saved version (sometimes saved versions have not been rendered yet, so you get the latest available render):

    javascript:location=location.href.replace(/^https:\/\/web\.archive\.org\/save\/http/,'https://web.archive.org/web/30000000000000/http')

    The WayBack Machine uses a 14-position ID and tries to find the render that is closest to it. This is the format of the ID:

    yyyymmddhhmmss

    This is granular enough, as the WayBack machine only allows new saves that are usually 30+ minutes apart.

    (Note that this period by now seems to have increased from 30+ minutes to 45+ minutes.)
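
The 14-position ID format can be sketched as a pair of conversion functions. This is a minimal sketch under one loud assumption: that the timestamps are in UTC, which the posts above do not state. The names `toWaybackTimestamp` and `parseWaybackTimestamp` are hypothetical.

```javascript
// Format a Date as a 14-digit yyyymmddhhmmss ID.
// ASSUMPTION: Wayback Machine timestamps are UTC (not stated in the post).
function toWaybackTimestamp(date) {
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return (
    pad(date.getUTCFullYear(), 4) +
    pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds())
  );
}

// Parse a 14-digit ID back into a Date; returns null if the shape is wrong.
function parseWaybackTimestamp(timestamp) {
  const m = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(timestamp);
  if (!m) return null;
  const [, y, mo, d, h, mi, s] = m.map(Number);
  return new Date(Date.UTC(y, mo - 1, d, h, mi, s));
}

console.log(toWaybackTimestamp(new Date(Date.UTC(2022, 2, 12, 16, 11, 17))));
console.log(parseWaybackTimestamp('20220312161117').toISOString());
```

The parser only checks the shape of the ID (14 digits); it does not validate that month, day, or time fields are in range.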

I also found back this post having the same huge number: 0.30000000000000004.com. How cool is WordPress search (:

–jeroen

Posted in Bookmarklet, Development, Internet, InternetArchive, JavaScript/ECMAScript, Power User, Scripting, Software Development, WayBack machine, Web Browsers | Leave a Comment »

Bookmarklet for Archive.is to navigate to the canonical link

Posted by jpluimers on 2023/08/15

This is a follow-up to Bookmarklets for Archive.is and the WayBack Machine to go to the original page.

Archive.is has two kinds of URLs:

  1. The encoded version is the short form without any meta-information.
  2. The canonical version is a long form and has metadata about the archive date and time, and the archived URL.

You get the first URL both after archiving and when browsing from an archived page to another archived page (if it is not archived you will go to the unarchived full page URL).

Read the rest of this entry »

Posted in archive.is / archive.today, Development, Internet, InternetArchive, JavaScript/ECMAScript, Power User, Scripting, Software Development, WayBack machine | Leave a Comment »

Working around Archive.is/.today/.ph/.li/.vn/.fo/.md eternal spinner “Loading” when trying to archive a page

Posted by jpluimers on 2023/01/13

I have had the below Archive.is spinner “Loading” without any progress indication on a couple of URLs the last few months and I think they are tied to having special characters in the URL-to-be-archived.

My usual workaround was to first archive in the Wayback Machine, then archive the resulting URL in Archive.is, as it would automatically follow the path up to the original URL.

That of course failed when https://web.archive.org/web/*/vx-underground.org did not want to save in Archive.is: it would give an eternal spinner on the “Loading” page, no matter which browser you were using, with either the escaped %2A or the literal *:

Read the rest of this entry »

Posted in archive.is / archive.today, Conference Topics, Conferences, Event, Internet, InternetArchive, LifeHacker, Power User, WayBack machine | Leave a Comment »