The JavaScript bookmarklets that saved me a lot of time documenting the Embarcadero docwiki outage

Posted by jpluimers on 2023/09/28

In the winter of 2022, the Embarcadero docwiki (their most active site, which contains all documentation for all their products) was down. Twice. First for a week, then parts of it for almost a week, after which only parts of the Alexandria documentation came back up in a stable way.

Back then I published "The Delphi documentation site docwiki.embarcadero.com has been down/up oscillating for 4 days is now down for almost a day". The product and library documentation for the most recent version got back up in a week, but the Code Examples and older product versions took much longer.

Usually one learns way more about a system when it is failing than when it is working. That was the case for this system as well.

Documenting the failing system took considerable time, but would have taken way more if not for these two JavaScript browser bookmarklets:

Archiving a page in Archive.is (as the Wayback Machine does not archive web pages throwing HTTP errors):
```
javascript:void(open('https://archive.is/?run=1&url='+encodeURIComponent(document.location)))
```

From the archived page, create an HTML list item with links to the archived and the original page, plus some information from the page:

```
javascript:{
function x(xpath, parent) {
  result = document.evaluate(xpath, parent || document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
  nodes = []
  while (node = result.iterateNext())
    nodes.push(node);
  return nodes;
}
aua=document.createElement("a");
oua=document.createElement("a");
aua.href=document.querySelector('link[rel="canonical"]')?.href;
o=document.querySelectorAll('input[readonly]')[0]?.value;
o=o??document.getElementsByName("q")[0]?.value;
oua.href=o;
aua.text="Archive";
oua.text=document.title;
aua.target="_blank";
oua.target="_blank";
aua.rel="noopener";
oua.rel="noopener";
ouaps=oua.pathname.split("/");
r=new Intl.DisplayNames(['en'], {type: 'language'});
l=ouaps[3];
l=r.of(l==="e"?"en":l);
s=ouaps[4]??'';
if(!!s){s=" "+decodeURI(s)};
divs=x('//div[contains(., "1146")]');
d=divs[divs.length-1]?.textContent.split("`")[1]??'';
if(!!d){d=` <code>${d}</code>`};
li=`<li>[${aua.outerHTML}] ${oua.outerHTML} ${l} ${ouaps[2]} ${ouaps[1]}${s}${d}</li>`;
prompt("li", li);
}
```

I mentioned the first in "Archive.is is more like a thread unroll service than an archival service". It is pretty straightforward: it converts the current URL (which is in document.location) into URL-encoded form (using encodeURIComponent), appends it to https://archive.is/?run=1&url= and opens a new browser tab with it.

The new tab is important: when archiving more than a few pages at a time, Archive.is will show a captcha (sometimes again after a bunch of pages) and won’t save any page until you have solved the captcha.
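The URL construction in the first bookmarklet can be sketched as a plain function, which makes it easy to try outside a bookmark. This is only an illustration: the function name `buildArchiveSaveUrl` is hypothetical, and the `https://archive.is/?run=1&url=` prefix is taken verbatim from the bookmarklet above.

```javascript
// Hypothetical helper (not part of the original bookmarklet): builds the
// Archive.is "save" URL the same way the bookmarklet does.
function buildArchiveSaveUrl(pageUrl) {
  // encodeURIComponent escapes characters such as ":", "/", "?" and "&",
  // so the page URL survives as a single query parameter value.
  return 'https://archive.is/?run=1&url=' + encodeURIComponent(String(pageUrl));
}

const saveUrl = buildArchiveSaveUrl(
  'https://docwiki.embarcadero.com/RADStudio/XE8/en/Main_Page'
);
console.log(saveUrl);
```

In a browser, `open(buildArchiveSaveUrl(document.location))` should then behave like the bookmarklet.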

The second bookmarklet is way more complex. First, it uses abbreviated variable names to keep it short. In retrospect, I should have made them longer, so here is the translation table:

Short	Long	Description
`aua`	`archivedUrlAnchor`	HTML anchor having the Archive.is canonical URL of the page
`oua`	`originalUrlAnchor`	HTML anchor having the original URL of the page
`ouaps`	`originalUrlAnchorPathsSplitted`	Path portion of the original URL of the page, split on `/` boundaries
`r`	`regionHelper`	Helper to convert 2-character language code into readable English form
`l`	`language`
`s`	`suffix`
`d`	`databaseName`	database name of the `l10n_cache` database (used for localisation/l10n)

Second, it uses quite a few JavaScript language tricks and framework knowledge to keep things short.

Let’s dig into these.

Getting data from the document

Filling the URLs is done in the below code which I already explained in Source: Bookmarklet for Archive.is to navivate to the canonical link and Bookmarklets for Archive.is and the WayBack Machine to go to the original page:

```
aua.href=document.querySelector('link[rel="canonical"]')?.href;
o=document.querySelectorAll('input[readonly]')[0]?.value;
o=o??document.getElementsByName("q")[0]?.value;
oua.href=o;
```

The third line can also be replaced with this one:

```
o??=document.getElementsByName("q")[0]?.value;
```

This uses three operators that have to do with the JavaScript concept Nullish. Before digging into Nullish however, first let me point back to the two query functions I explained for Archive.is in Bookmarklets for Archive.is and the WayBack Machine to go to the original page.

The operators are:

[Wayback/Archive] Optional chaining (?.) – JavaScript | MDN:

The optional chaining operator (?.) enables you to read the value of a property located deep within a chain of connected objects without having to check that each reference in the chain is valid.

The ?. operator is like the . chaining operator, except that instead of causing an error if a reference is nullish (null or undefined), the expression short-circuits with a return value of undefined. When used with function calls, it returns undefined if the given function does not exist.

[Wayback/Archive] Nullish coalescing operator (??) – JavaScript | MDN:

The nullish coalescing operator (??) is a logical operator that returns its right-hand side operand when its left-hand side operand is null or undefined, and otherwise returns its left-hand side operand.

This can be contrasted with the logical OR (||) operator, which returns the right-hand side operand if the left operand is any falsy value, not only null or undefined.

[Wayback/Archive] Logical nullish assignment (??=) – JavaScript | MDN:

The logical nullish assignment (x ??= y) operator only assigns if x is nullish (null or undefined).
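To see the three operators work together without a browser, here is a minimal sketch. The two arrays are hypothetical stand-ins for the results of the two DOM lookups in the bookmarklet; everything else is plain JavaScript.

```javascript
// Hypothetical stand-ins for the two DOM lookups in the bookmarklet:
// the first finds nothing, the second finds one element with a value.
const readonlyInputs = [];
const inputsNamedQ = [
  { value: 'https://docwiki.embarcadero.com/RADStudio/XE8/en/Main_Page' },
];

// ?. : readonlyInputs[0] is undefined, so .value is never read and the
// whole expression becomes undefined instead of throwing a TypeError.
let o = readonlyInputs[0]?.value;

// ??= : assigns only because o is nullish at this point.
o ??= inputsNamedQ[0]?.value;

// ?? : falls back only for null/undefined, so an empty string is kept.
const kept = '' ?? 'fallback';

console.log(o, JSON.stringify(kept));
```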

I will dig into Nullish in a few headings.

This bit uses two JavaScript operators for handling undefined/null values, which JavaScript together calls Nullish and which are a subset of Falsy:

[Wayback/Archive] Nullish value – MDN Web Docs Glossary: Definitions of Web-related terms | MDN

In JavaScript, a nullish value is the value which is either null or undefined. Nullish values are always falsy.

[Wayback/Archive] Falsy – MDN Web Docs Glossary: Definitions of Web-related terms | MDN

A falsy (sometimes written falsey) value is a value that is considered false when encountered in a Boolean context.

JavaScript uses type conversion to coerce any value to a Boolean in contexts that require it, such as conditionals and loops.

The following table provides a complete list of JavaScript falsy values:

Value	Description
`false`	The keyword `false`.
`0`	The `Number` zero (so, also `0.0`, etc., and `0x0`).
`-0`	The `Number` negative zero (so, also `-0.0`, etc., and `-0x0`).
`0n`	The `BigInt` zero (so, also `0x0n`). Note that there is no `BigInt` negative zero — the negation of `0n` is `0n`.
`""`, `''`, ``	Empty string value.
null	null — the absence of any value.
undefined	undefined — the primitive value.
NaN	NaN — not a number.
`document.all`	Objects are falsy if and only if they have the [[IsHTMLDDA]] internal slot. That slot only exists in `document.all` and cannot be set using JavaScript.

I included the whole table as the if(!!s) trick in the code above is also based on it.
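The relation between Falsy and Nullish can be checked with a few lines of plain JavaScript. This is a small sketch that leaves out `document.all`, since that only exists in browsers; the variable names are made up for the illustration.

```javascript
// The falsy values from the table, except document.all (browser-only).
const falsyValues = [false, 0, -0, 0n, '', null, undefined, NaN];

// Every one of them coerces to false in a Boolean context.
const allFalsy = falsyValues.every((value) => !value);

// Only null and undefined are nullish; in this list, `value == null`
// matches exactly those two.
const nullishValues = falsyValues.filter((value) => value == null);

// Non-empty strings are truthy, even when they look "empty" or "false".
const truthyStrings = ['0', ' ', 'false'].every((value) => !!value);

console.log(allFalsy, nullishValues, truthyStrings);
```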

In addition to the ?. optional chaining operator to handle the nullish cases, this code also uses the [Wayback/Archive] Nullish coalescing operator (??) – JavaScript | MDN:

The nullish coalescing operator (??) is a logical operator that returns its right-hand side operand when its left-hand side operand is null or undefined, and otherwise returns its left-hand side operand.

I wrote a tiny bit about both operators before in Bookmarklets for Archive.is and the WayBack Machine to go to the original page, but it is worth repeating in more detail here as the concept is crucial and used often.

Postprocessing the data: the right format part 1

The docwiki URLs use standard 2-character language codes, which you can look up in List of ISO 639-1 codes – Wikipedia. The conversion is done through [Wayback/Archive] Intl.DisplayNames – JavaScript | MDN:

The Intl.DisplayNames object enables the consistent translation of language, region and script display names.

Sometimes however, the docwiki uses invalid language codes:

A URL like https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page redirects to https://docwiki.embarcadero.com/CodeExamples/Sydney/e/Special:Search/Main%20Page which uses e as language code (which in turn is not a valid language code).

Sometimes this is reflected in the Archive.is archive; for instance, compare these two:

archive.ph/QRWSq -> https://archive.ph/2022.03.10-081457/https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page

Saved from
https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page

Not Found

The requested URL /CodeExamples/Sydney/de/Special:Search/Main Page was not found on this server.

https://archive.ph/2DW4p -> https://archive.ph/2022.03.10-081452/https://docwiki.embarcadero.com/CodeExamples/Rio/e/Special:Search/Main%20Page

Saved from
https://docwiki.embarcadero.com/CodeExamples/Rio/e/Special:Search/Main%20Page

Redirected from
https://docwiki.embarcadero.com/CodeExamples/Rio/de/Special:Search/Main%20Page

Not Found

The requested URL /CodeExamples/Rio/e/Special:Search/Main Page was not found on this server.

Solving this is done in this line:

```
l=r.of(l==="e"?"en":l);
```

It uses both the operator [Wayback/Archive] Strict equality (===) – JavaScript | MDN

The strict equality operator (===) checks whether its two operands are equal, returning a Boolean result. Unlike the equality operator, the strict equality operator always considers operands of different types to be different.

and the [Wayback/Archive] Conditional (ternary) operator – JavaScript | MDN

The conditional (ternary) operator is the only JavaScript operator that takes three operands: a condition followed by a question mark (?), then an expression to execute if the condition is truthy followed by a colon (:), and finally the expression to execute if the condition is falsy. This operator is frequently used as an alternative to an if...else statement.
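The language lookup, including the workaround for the "e" code, can be tried in isolation. The sketch below wraps the two bookmarklet lines in a hypothetical helper named `languageNameFromPathCode`; the sample codes are just examples. As far as I can tell from the Intl.DisplayNames specification, a one-letter code such as "e" is not a well-formed language code, so passing it unchanged would presumably raise a RangeError rather than return a name.

```javascript
// Same construction as in the bookmarklet: English display names for languages.
const languageNames = new Intl.DisplayNames(['en'], { type: 'language' });

// Hypothetical helper wrapping the ternary from the bookmarklet.
function languageNameFromPathCode(code) {
  // The docwiki sometimes redirects to "e" instead of a real language code;
  // map that back to "en" before asking Intl.DisplayNames.
  return languageNames.of(code === 'e' ? 'en' : code);
}

console.log(['en', 'de', 'fr', 'ja', 'e'].map(languageNameFromPathCode));
```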

Lax typing is both a strength and weakness of JavaScript. Hence the usage of both the === operator and if(!!s) trick in my code.

Let’s dig into the latter now.

Postprocessing: more nullish values

Depending on the original page, not all bits are present in the archived page. When they are not, they are either undefined or null, which JavaScript collectively calls Nullish and which is a subset of Falsy (for both terms, see the references above).

Furthermore, the information might not be in the right format.

Both are solved with these JavaScript tricks:

```
if(!!s){s=" "+decodeURI(s)};
```

Sometimes s is of the form Main%20Page which is Main Page in URL-encoding and can be decoded using [Wayback/Archive] decodeURI() – JavaScript | MDN.

Sometimes s has no value. The if(!!s) trick covers that (it is based on Falsy, which is explained above) and executes the {s=" "+decodeURI(s)} part only if s has a value. The first bit can also be written as if(Boolean(s)). Handling Nullish values is a common problem and is explained by [Wayback/Archive] karthick.sk in [Wayback/Archive] How can I check for an empty/undefined/null string in JavaScript? – Stack Overflow (thanks [Wayback/Archive] casademora for asking!):

All the previous answers are good, but this will be even better. Use dual NOT operators (!!):
if (!!str) {
    // Some code here
}
Or use type casting:
if (Boolean(str)) {
    // Code here
}
Both do the same function. Typecast the variable to Boolean, where str is a variable.

It returns false for null, undefined, 0, 000, "", false.

It returns true for all string values other than the empty string (including strings like "0" and " ")
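Applied to the suffix handling in the bookmarklet, the dual NOT check can be sketched as follows. The function name `formatSuffix` is hypothetical; the body mirrors the `s=ouaps[4]??''` and `if(!!s)` lines.

```javascript
// Hypothetical helper: same logic as the suffix lines in the bookmarklet.
function formatSuffix(pathPart) {
  let s = pathPart ?? '';      // a missing path part becomes an empty string
  if (!!s) {                   // only non-empty strings pass this check
    s = ' ' + decodeURI(s);    // "Main%20Page" becomes " Main Page"
  }
  return s;
}

console.log(JSON.stringify(formatSuffix('Main%20Page')));
console.log(JSON.stringify(formatSuffix(undefined)));
```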

There are many other ways to check for this. Some performance measurements have been done by [Wayback/Archive] Kamil Kiełczewski at [Wayback/Archive] How can I check for an empty/undefined/null string in JavaScript? – Stack Overflow:

I perform tests on macOS v10.13.6 (High Sierra) for 18 chosen solutions. Solutions works slightly different (for corner-case input data) which was presented in the snippet below.

Conclusions

the simple solutions based on !str,==,=== and length are fast for all browsers (A,B,C,G,I,J)

the solutions based on the regular expression (test,replace) and charAt are slowest for all browsers (H,L,M,P)

the solutions marked as fastest was fastest only for one test run – but in many runs it changes inside ‘fast’ solutions group

Back to my code.

Getting the database name

That is: if there is one. Not all saved pages had an error on them indicating the database name. The ones that did look like [Archive] Internal error – RAD Studio: XE8 main page:

[53d58941e2d881306538a66d] /RADStudio/XE8/en/Main_Page Wikimedia\Rdbms\DBQueryError from line 1457 of /var/www/html/shared/BaseWiki31/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading?
Query: SELECT lc_value FROM `rad_xe8_en_l10n_cache` WHERE lc_lang = 'en' AND lc_key = 'deps' LIMIT 1
Function: LCStoreDB::get
Error: 1146 Table 'wikidb.rad_xe8_en_l10n_cache' doesn't exist (10.50.1.120)
Backtrace:

This line gets the div elements (see HTML element: div – Wikipedia) containing 1146:

```
divs=x('//div[contains(., "1146")]');
```

The x function is a condensed version of the getElementsByXPath I described last week in XPath based bookmarklets for Archive.is: more JavaScript fiddling!.

Note that it uses || to perform default assignment, as shown in [Wayback/Archive] 3 Ways to Set Default Value in JavaScript | SamanthaMing.com.
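The `parent || document` default can be compared with a default parameter in a small sketch. The object `defaultRoot` is a hypothetical stand-in for the global document, and both function names are made up for the comparison.

```javascript
// Hypothetical stand-in for the global document object.
const defaultRoot = { name: 'document' };

// The || form used in the x function: any falsy argument gets the default.
function rootWithOr(parent) {
  return parent || defaultRoot;
}

// A default parameter only kicks in for undefined, so null stays null.
function rootWithDefaultParameter(parent = defaultRoot) {
  return parent;
}

console.log(rootWithOr(null) === defaultRoot);           // the || form replaces null
console.log(rootWithDefaultParameter(null) === null);    // the default parameter does not
```

For a bookmarklet the `||` form is a reasonable choice: an XPath context node is always an object, so there is no falsy-but-valid argument to lose.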

More Nullish undefined/null handling

There is another bit which has to do with Nullish values:

```
d=divs[divs.length-1]?.textContent.split("`")[1]??'';
```

One statement having both the ?. and ?? operator to handle both cases of Nullish: the div not existing, or the split not returning enough elements.
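Both Nullish cases can be exercised without a DOM by working on the text directly. In this sketch the helper name `databaseTableName` is hypothetical, and the sample text is the Query line from the error page shown earlier.

```javascript
// Hypothetical helper: takes the text content of the last matching div
// (or undefined when no div matched) and returns the back-ticked name.
function databaseTableName(text) {
  // ?. short-circuits the whole chain when text is nullish;
  // ?? covers the case where split() yields fewer than two parts.
  return text?.split('`')[1] ?? '';
}

const queryLine =
  "Query: SELECT lc_value FROM `rad_xe8_en_l10n_cache` WHERE lc_lang = 'en' AND lc_key = 'deps' LIMIT 1";

console.log(databaseTableName(queryLine));
console.log(JSON.stringify(databaseTableName(undefined)));
```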

Wut, no regex?

When you look at the above code, there are no regular expressions in it.

I hesitated a bit when writing this part:

```
d=divs[divs.length-1]?.textContent.split("`")[1]??'';
```

It assumes the empirically observed pattern that the last div in the result contains the database table name within backticks, if there is one at all. Doing it in regex would be at least as complex and require yet another language skill.
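For comparison, a regular-expression variant might look like the sketch below. It is not what the bookmarklet uses; the function name `tableNameWithRegex` and the pattern are assumptions made for this illustration.

```javascript
// Hypothetical regex-based alternative to the split("`")[1] approach.
function tableNameWithRegex(text) {
  // Capture whatever sits between the first pair of backticks.
  // match() returns null when nothing matches, hence the second ?. operator.
  return text?.match(/`([^`]*)`/)?.[1] ?? '';
}

const sampleQuery =
  "Query: SELECT lc_value FROM `rad_xe8_en_l10n_cache` WHERE lc_lang = 'en'";

console.log(tableNameWithRegex(sampleQuery));
```

One behavioural difference: for text with a single, unclosed backtick the split version returns everything after that backtick, while this pattern returns an empty string.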

Anchor assembly

The easiest way to build an HTML anchor is by using an instance of [Wayback/Archive] HTMLAnchorElement – Web APIs | MDN. These lines use various properties of it:

```
aua.href=document.querySelector('link[rel="canonical"]')?.href;
oua.href=o;
aua.text="Archive";
oua.text=document.title;
aua.target="_blank";
oua.target="_blank";
aua.rel="noopener";
oua.rel="noopener";
ouaps=oua.pathname.split("/");
```

Creation is easily done through [Wayback/Archive] Document.createElement() – Web APIs | MDN, but you have to know that a maps to HTMLAnchorElement:

```
aua=document.createElement("a");
oua=document.createElement("a");
```
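The path splitting on the anchor can be reproduced outside a browser with the standard URL class, which exposes a `pathname` property much like an anchor element does. This is a sketch using the XE8 page from the error example; the variable names are made up.

```javascript
// URL stands in for the anchor element here; both expose pathname.
const originalUrl = new URL(
  'https://docwiki.embarcadero.com/RADStudio/XE8/en/Main_Page'
);

// The path starts with "/", so the first element is an empty string.
const pathParts = originalUrl.pathname.split('/');

// Same indexes as ouaps[1] .. ouaps[4] in the bookmarklet.
const [, product, version, languageCode, page] = pathParts;

console.log(pathParts);
console.log(product, version, languageCode, page);
```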

String assembly

Assembling strings can be a tedious job. I prefer to use backticks for this, as they allow embedding JavaScript expressions within the template string (the ${...} parts below):

```
li=`<li>[${aua.outerHTML}] ${oua.outerHTML} ${l} ${ouaps[2]} ${ouaps[1]}${s}${d}</li>`;
```

The mechanism is called [Wayback/Archive] Template literals (Template strings) – JavaScript | MDN

Template literals are string literals allowing embedded expressions. You can use multi-line strings and string interpolation features with them.

They were called “template strings” in prior editions of the ES2015 specification.
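A browser-independent sketch of the list item assembly, using the same template as the bookmarklet: the function name `buildListItem`, its parameter names and the `archive.example` link are hypothetical, and the anchors are passed in as ready-made HTML strings instead of `outerHTML` values.

```javascript
// Hypothetical helper: same template literal as the bookmarklet,
// with the pieces passed in as plain strings.
function buildListItem({ archiveHtml, originalHtml, language, version, product, suffix = '', databaseName = '' }) {
  // Mirrors if(!!d){...}: wrap the name in <code> only when there is one.
  const code = databaseName ? ` <code>${databaseName}</code>` : '';
  return `<li>[${archiveHtml}] ${originalHtml} ${language} ${version} ${product}${suffix}${code}</li>`;
}

const listItem = buildListItem({
  archiveHtml: '<a href="https://archive.example/abc">Archive</a>', // made-up link
  originalHtml: '<a href="https://docwiki.embarcadero.com/RADStudio/XE8/en/Main_Page">Internal error</a>',
  language: 'English',
  version: 'XE8',
  product: 'RADStudio',
  suffix: ' Main_Page',
  databaseName: 'rad_xe8_en_l10n_cache',
});

console.log(listItem);
```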

I briefly mentioned them 5 years ago in I wish the Delphi language supported multi-line strings, and they were part of a few examples in the more recent post Source: For my reading list: some links on Twitter bookmarklets.

They are cool and make code a lot more readable: it is immediately clear how the List Element (see HTML element: li – Wikipedia) is being built.

--jeroen

This entry was posted on 2023/09/28 at 12:00 and is filed under Bookmarklet, Conference Topics, Conferences, Development, Event, JavaScript/ECMAScript, Power User, Scripting, Software Development, Web Browsers.
