The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


How to access Archive.org’s Google+ communities archive? : googleplus

Posted by jpluimers on 2022/03/15

On my research list: [Wayback/Archive.is] How to access Archive.org’s Google+ communities archive? : googleplus, as there are so many interesting programming-related posts there.

The main takeaway is that to access an archived Google+ post, you need to know, or be able to reconstruct, the canonical URL of that post including its language specifier; see the comment in the first related link below.

It looks like, for my archived profile links Wayback – Google+: Jeroen Wiert Pluimers (UUID) / Wayback – Google+: Jeroen Wiert Pluimers (user name), only some 30 links were archived directly through the Wayback save-as feature based on my UUID, and some 250 based on my username profile. Hopefully more.

Some related links:

  1. [Wayback/Archive.is] dredmorbius comments on How to access Archive.org’s Google+ communities archive?

    dredmorbius Author of the article you’ve linked.

    Unfortunately, no, there’s not a really good way of finding content on the Internet Archive’s WBM, unless you already know the URL(s) you’re looking for.

    Keep in mind that:

    • Not everything got captured. I’ve been having a discussion with another G+ user over this, and spot-checking multiple URLs finds no archive of many.

    • There are several variants of G+ post URLs. You want the one with the 20-digit numeric UUID, and NOT the “vanity url” +FirstnameLastname format.

    • Also strip out any instances of /u/[0-9]+/ within the URL. E.g., if you see “https://plus.google.com/u/0/<UUID>”, change that to “https://plus.google.com/<UUID>” (where UUID is the numeric user string).

    • User profile homepages are frequently archived, but the visible posts cannot themselves be opened. This is … unfortunate.

    • Similarly: only the first page of an infinite scroll of User, Brand, Collection, Community, etc., pages is captured. Unless there are multiple captures over time, you’re not going to get a full user history there.

    Generally, your best bet is to have some link to G+ content that you can convert to the appropriate format as Internet Archive might have saved, and check to see if it’s stored. Again, this is tedious, though at least in many cases, useful.

    There’s a list of some of the more notable G+ users and Communities at PlexodusWiki which may also be helpful in tracking down specific references.

    Also: it turns out that slight variations in URL format can mean you do or don’t find a page.

    I just ran into this trying to track down a post and discovered that the URL arguments — here a language specifier — are critical in returning the intended post.

    Discussion: https://mastodon.cloud/@dredmorbius/103592826938741244

    The fully qualified G+ URL is found: https://web.archive.org/web/20190325032955/https://plus.google.com/104092656004159577193/posts/4REjF1smHpE?hl=en

    But stripping off ?hl=en, even when wildcarded, is not:

    https://web.archive.org/web/2019*/https://plus.google.com/104092656004159577193/posts/4REjF1smHpE

    Unfortunately, the IA’s WBM requires JS to return content, which means that simple means of testing with common shell tools in scripts (allowing a large number of candidate URIs to be checked quickly) isn’t possible.

  2. [Wayback] Doc Edward Morbius ❌​: “@woozle@toot.cat You might also try appending “?h…” – mastodon.cloud
  3. [Wayback] G+ Notable Communities Database – PlexodusWiki
  4. [Wayback] Google+ tracker – #googleminus – Donate at https://archive.org/donate/ for hosting the archives Dashboard
  5. [Wayback/Archive.is] Saving of public Google+ content at the Internet Archive’s Wayback Machine by the Archive Team has begun : plexodus
  6. GitHub – ArchiveTeam/googleplus-grab: Archiving Google+.
  7. [Wayback/Archive.is] Plexodus: The Google+ Exodus subreddit : plexodus
  8. [Wayback/Archive.is] Internet Data Is Rotting | Hacker News
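
The URL rules from the quoted comment (drop the /u/[0-9]+/ segment, prefer the numeric profile ID over the vanity name, keep the ?hl=en argument) can be sketched in Python. This is only a rough illustration: the function names are my own, the default language "en" and the "2019*" wildcard are taken from the single example in the comment, and nothing here checks whether a capture actually exists.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Per-account path segment, e.g. "/u/0/" in https://plus.google.com/u/0/<UUID>
ACCOUNT_SEGMENT = re.compile(r"/u/[0-9]+/")


def canonical_gplus_url(url, lang="en"):
    """Strip /u/<n>/ and make sure an hl=<lang> argument is present.

    Assumption: "hl=en" is the variant that got archived, as in the
    example from the comment; other posts may need another language.
    """
    parts = urlsplit(url)
    path = ACCOUNT_SEGMENT.sub("/", parts.path)
    query = dict(parse_qsl(parts.query))
    query.setdefault("hl", lang)  # keep an existing hl value untouched
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ""))


def has_numeric_id(url):
    """True when the profile part is all digits, not a +VanityName."""
    path = ACCOUNT_SEGMENT.sub("/", urlsplit(url).path)
    first_segment = path.strip("/").split("/")[0]
    return first_segment.isdigit()


def wayback_url(url, timestamp="2019*"):
    """Build a Wayback Machine URL; "2019*" lists captures from 2019."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


if __name__ == "__main__":
    link = "https://plus.google.com/u/0/104092656004159577193/posts/4REjF1smHpE"
    fixed = canonical_gplus_url(link)
    print(has_numeric_id(fixed), wayback_url(fixed))
```

For the post from the comment, this turns the /u/0/ link into https://plus.google.com/104092656004159577193/posts/4REjF1smHpE?hl=en, which is the form that was found in the archive. Links using a +FirstnameLastname vanity name are only flagged by has_numeric_id; mapping those to the numeric ID still has to be done by hand.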

–jeroen
