The JavaScript bookmarklets that saved me a lot of time documenting the Embarcadero docwiki outage
Posted by jpluimers on 2023/09/28
In the winter of 2022, the Embarcadero docwiki (their most active site, which contains all documentation for all their products) was down. Twice. First for a week, then parts of it for almost a week, after which only parts of the Alexandria documentation came back up in a stable way.
Back then I published The Delphi documentation site docwiki.embarcadero.com has been down/up oscillating for 4 days is now down for almost a day.. The product and library documentation for the most recent version got back up in a week, but the Code Examples and older product versions took much longer.
Usually one learns way more about a system when it is failing than when it is working. That was the case with this system as well.
Documenting the failing system took considerable time, but would have taken way more if not for these two JavaScript browser bookmarklets:
- Archiving a page in Archive.is (as the Wayback Machine does not archive web pages throwing http errors):
javascript:void(open('https://archive.is/?run=1&url='+encodeURIComponent(document.location)))
- From the archived page, create an html list-item with link to archived and actual page plus some information from the page:
javascript:{ function x(xpath, parent) { result = document.evaluate(xpath, parent || document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null); nodes = []; while (node = result.iterateNext()) nodes.push(node); return nodes; } aua=document.createElement("a"); oua=document.createElement("a"); aua.href=document.querySelector('link[rel="canonical"]')?.href; o=document.querySelectorAll('input[readonly]')[0]?.value; o=o??document.getElementsByName("q")[0]?.value; oua.href=o; aua.text="Archive"; oua.text=document.title; aua.target="_blank"; oua.target="_blank"; aua.rel="noopener"; oua.rel="noopener"; ouaps=oua.pathname.split("/"); r=new Intl.DisplayNames(['en'], {type: 'language'}); l=ouaps[3]; l=r.of(l==="e"?"en":l); s=ouaps[4]??''; if(!!s){s=" "+decodeURI(s)}; divs=x('//div[contains(., "1146")]'); d=divs[divs.length-1]?.textContent.split("`")[1]??''; if(!!d){d=` <code>${d}</code>`}; li=`<li>[${aua.outerHTML}] ${oua.outerHTML} ${l} ${ouaps[2]} ${ouaps[1]}${s}${d}</li>`; prompt("li", li); }
I mentioned the first one in Archive.is is more like a thread unroll service than an archival service, and it is pretty straightforward: it converts the current URL (which is in document.location) into URL-encoded form (using encodeURIComponent), then appends it to https://archive.is/?run=1&url= and opens a new browser tab with it.
The latter is important as when archiving more than a few pages at a time, Archive.is will show a captcha (sometimes again after a bunch of pages) and won’t save any page until you have solved the captcha.
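The URL construction can be tried outside the browser as well. Below is a minimal sketch of it; the helper name buildArchiveUrl is my own invention for illustration and is not part of the bookmarklet, which does the same thing inline and then calls open().

```javascript
// Sketch: build the Archive.is submission URL the way the first bookmarklet does.
// buildArchiveUrl is a hypothetical helper name, not part of the bookmarklet.
function buildArchiveUrl(pageUrl) {
  // encodeURIComponent escapes ":", "/", "?" and "&", so the complete page URL
  // survives as a single query parameter value.
  return "https://archive.is/?run=1&url=" + encodeURIComponent(String(pageUrl));
}

const archiveSubmitUrl = buildArchiveUrl(
  "https://docwiki.embarcadero.com/RADStudio/XE8/en/Main_Page"
);
console.log(archiveSubmitUrl);
```

In the bookmarklet itself, document.location takes the place of the hard-coded page URL.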
The second bookmarklet is way more complex. First, it uses abbreviated variable names to keep it short. In retrospect, I should have made them longer, so here is the translation table:

| Short | Long | Description |
| --- | --- | --- |
| aua | archivedUrlAnchor | HTML anchor having the Archive.is canonical URL of the page |
| oua | originalUrlAnchor | HTML anchor having the original URL of the page |
| ouaps | originalUrlAnchorPathsSplitted | Path portion of the original URL of the page, split on / boundaries |
| r | regionHelper | Helper to convert 2-character language code into readable English form |
| l | language | |
| s | suffix | |
| d | databaseName | database name of the l10n_cache database (used for localisation/l10n) |
Second, it uses quite a few JavaScript language tricks and framework knowledge to keep things short.
Let’s dig into these.
Getting data from the document
Filling the URLs is done in the below code which I already explained in Source: Bookmarklet for Archive.is to navivate to the canonical link and Bookmarklets for Archive.is and the WayBack Machine to go to the original page:
aua.href=document.querySelector('link[rel="canonical"]')?.href;
o=document.querySelectorAll('input[readonly]')[0]?.value;
o=o??document.getElementsByName("q")[0]?.value;
oua.href=o;
The third line can also be replaced with this one:
o??=document.getElementsByName("q")[0]?.value;
This uses three operators that have to do with the JavaScript concept Nullish. Before digging into Nullish however, first let me point back to the two query functions I explained for Archive.is in Bookmarklets for Archive.is and the WayBack Machine to go to the original page.
The operators are:
- [Wayback/Archive] Optional chaining (?.) – JavaScript | MDN:
  The optional chaining operator (?.) enables you to read the value of a property located deep within a chain of connected objects without having to check that each reference in the chain is valid.
- [Wayback/Archive] Nullish coalescing operator (??) – JavaScript | MDN
  The nullish coalescing operator (??) is a logical operator that returns its right-hand side operand when its left-hand side operand is null or undefined, and otherwise returns its left-hand side operand.
  This can be contrasted with the logical OR (||) operator, which returns the right-hand side operand if the left operand is any falsy value, not only null or undefined.
- [Wayback/Archive] Logical nullish assignment (??=) – JavaScript | MDN
  The logical nullish assignment (x ??= y) operator only assigns if x is nullish (null or undefined).
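To see the three operators working together without a browser, here is a small sketch. The arrays readonlyInputs and queryInputs are hypothetical stand-ins for what querySelectorAll and getElementsByName would return on an archived page; the example.org URL is made up as well.

```javascript
// Hypothetical stand-ins for the two DOM lookups in the bookmarklet.
const readonlyInputs = [];                                    // no input[readonly] on this page
const queryInputs = [{ value: "https://example.org/page" }];  // the "q" input exists

// ?. yields undefined instead of throwing when the element is missing.
let originalUrl = readonlyInputs[0]?.value;

// ??= assigns only because originalUrl is nullish at this point.
originalUrl ??= queryInputs[0]?.value;

// ?? keeps values that are falsy but not nullish; || replaces them.
const keptByNullish = "" ?? "fallback";
const replacedByOr = "" || "fallback";

console.log(originalUrl, JSON.stringify(keptByNullish), replacedByOr);
```

The last two lines are the reason to prefer ?? over || whenever an empty string or 0 is a legitimate value.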
I will dig into Nullish in a few headings.
This bit uses two JavaScript operators for handling undefined/null values, which together JavaScript calls Nullish and are a subset of Falsy:
- [Wayback/Archive] Nullish value – MDN Web Docs Glossary: Definitions of Web-related terms | MDN
  In JavaScript, a nullish value is the value which is either null or undefined. Nullish values are always falsy.
- [Wayback/Archive] Falsy – MDN Web Docs Glossary: Definitions of Web-related terms | MDN
  A falsy (sometimes written falsey) value is a value that is considered false when encountered in a Boolean context.
  JavaScript uses type conversion to coerce any value to a Boolean in contexts that require it, such as conditionals and loops.
  The following table provides a complete list of JavaScript falsy values:

  | Value | Description |
  | --- | --- |
  | false | The keyword false. |
  | 0 | The Number zero (so, also 0.0, etc., and 0x0). |
  | -0 | The Number negative zero (so, also -0.0, etc., and -0x0). |
  | 0n | The BigInt zero (so, also 0x0n). Note that there is no BigInt negative zero: the negation of 0n is 0n. |
  | "", '', `` | Empty string value. |
  | null | null: the absence of any value. |
  | undefined | undefined: the primitive value. |
  | NaN | NaN: not a number. |
  | document.all | Objects are falsy if and only if they have the [[IsHTMLDDA]] internal slot. That slot only exists in document.all and cannot be set using JavaScript. |

I included the whole table as the if(!!s) trick in the code above is also based on it.
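The difference between Falsy and Nullish is easy to check. The sketch below runs the values from the table through a Boolean conversion; document.all is left out because it only exists in a browser. The variable names are mine.

```javascript
// All falsy values from the table, except the browser-only document.all.
const falsyValues = [false, 0, -0, 0n, "", null, undefined, NaN];

// Every one of them converts to false in a Boolean context.
const allFalsy = falsyValues.every((value) => !value);

// Only null and undefined are nullish (value == null matches exactly these two here).
const nullishValues = falsyValues.filter((value) => value == null);

// Values that look empty, but are truthy.
const truthyLookalikes = ["0", " ", [], {}];
const allTruthy = truthyLookalikes.every((value) => !!value);

console.log(allFalsy, nullishValues.length, allTruthy);
```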
In addition to the ?. optional chaining operator to handle the nullish cases, this code also uses the [Wayback/Archive] Nullish coalescing operator (??) – JavaScript | MDN:
The nullish coalescing operator (??) is a logical operator that returns its right-hand side operand when its left-hand side operand is null or undefined, and otherwise returns its left-hand side operand.
I wrote a tiny bit about both operators before in Bookmarklets for Archive.is and the WayBack Machine to go to the original page, but it is worth repeating in more detail here as the concept is crucial and used often.
Postprocessing the data: the right format part 1
The docwiki URLs use standard 2-character language codes which you can look up in List of ISO 639-1 codes – Wikipedia. The conversion is done through [Wayback/Archive] Intl.DisplayNames – JavaScript | MDN:
The Intl.DisplayNames object enables the consistent translation of language, region and script display names.
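A quick sketch of that conversion, assuming a JavaScript runtime with Intl.DisplayNames support (current browsers and recent Node.js versions have it). The list of codes is just an example.

```javascript
// Same construction as in the bookmarklet: English display names for languages.
const languageNames = new Intl.DisplayNames(["en"], { type: "language" });

// Convert a few 2-character ISO 639-1 codes into readable names.
const sampleCodes = ["en", "de", "fr", "ja"];
const readableNames = sampleCodes.map((code) => languageNames.of(code));

console.log(readableNames.join(", "));
```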
Sometimes however, the docwiki uses invalid language codes:
A URL like https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page redirects to https://docwiki.embarcadero.com/CodeExamples/Sydney/e/Special:Search/Main%20Page which uses e as language code (which in turn is not a valid language).
Sometimes this reflects in the Archive.is archival, for instance compare these two:
- archive.ph/QRWSq -> https://archive.ph/2022.03.10-081457/https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page
  Saved from https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page
  Not Found
  The requested URL /CodeExamples/Sydney/de/Special:Search/Main Page was not found on this server.
- https://archive.ph/2DW4p -> https://archive.ph/2022.03.10-081452/https://docwiki.embarcadero.com/CodeExamples/Rio/e/Special:Search/Main%20Page
  Saved from https://docwiki.embarcadero.com/CodeExamples/Rio/e/Special:Search/Main%20Page
  Redirected from https://docwiki.embarcadero.com/CodeExamples/Rio/de/Special:Search/Main%20Page
  Not Found
  The requested URL /CodeExamples/Rio/e/Special:Search/Main Page was not found on this server.
Solving this is done in this line:
l=r.of(l==="e"?"en":l);
It uses both the operator [Wayback/Archive] Strict equality (===) – JavaScript | MDN
The strict equality operator (===) checks whether its two operands are equal, returning a Boolean result. Unlike the equality operator, the strict equality operator always considers operands of different types to be different.
and the [Wayback/Archive] Conditional (ternary) operator – JavaScript | MDN
The conditional (ternary) operator is the only JavaScript operator that takes three operands: a condition followed by a question mark (?), then an expression to execute if the condition is truthy followed by a colon (:), and finally the expression to execute if the condition is falsy. This operator is frequently used as an alternative to an if...else statement.
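Put together, the mapping looks like the sketch below. The function name readableLanguage is hypothetical; the bookmarklet does this inline. As far as I understand the specification, a one-letter code like "e" is not a well-formed language tag, so handing it to of() unchanged would be expected to throw a RangeError, which is why it is replaced first.

```javascript
const displayNames = new Intl.DisplayNames(["en"], { type: "language" });

// readableLanguage is a hypothetical wrapper around the bookmarklet line
// l=r.of(l==="e"?"en":l);
function readableLanguage(code) {
  // Strict equality: only the exact string "e" is replaced, no type coercion.
  const validCode = code === "e" ? "en" : code;
  return displayNames.of(validCode);
}

// Loose equality coerces types, strict equality does not.
const looseComparison = 0 == "";
const strictComparison = 0 === "";

console.log(readableLanguage("e"), readableLanguage("de"), looseComparison, strictComparison);
```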
Lax typing is both a strength and weakness of JavaScript. Hence the usage of both the === operator and if(!!s) trick in my code.
Let’s dig into the latter now.
Postprocessing: more nullish values
Depending on the original page, not all bits are present in the archived page. When they are missing, they are either undefined or null, which JavaScript collectively calls Nullish and which is a subset of Falsy (for both terms, see the references above).
Furthermore, the information might not be in the right format.
Both are solved with these JavaScript tricks:
if(!!s){s=" "+decodeURI(s)};
Sometimes s is of the form Main%20Page which is Main Page in URL-encoding and can be decoded using [Wayback/Archive] decodeURI() – JavaScript | MDN.
Sometimes s has no value. The if(!!s) trick covers that (it is based on Falsy, which is explained above) and executes the {s=" "+decodeURI(s)} part only if s has a value. The first bit can also be written as if(Boolean(s)). Handling Nullish values is a common problem and is explained by [Wayback/Archive] karthick.sk in [Wayback/Archive] How can I check for an empty/undefined/null string in JavaScript? – Stack Overflow (thanks [Wayback/Archive] casademora for asking!):
All the previous answers are good, but this will be even better. Use dual NOT operators (!!):
if (!!str) {
    // Some code here
}
Or use type casting:
if (Boolean(str)) {
    // Code here
}
Both do the same function. Typecast the variable to Boolean, where str is a variable.
It returns false for null, undefined, 0, 000, "", false. It returns true for all string values other than the empty string (including strings like "0" and " ")
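Here is the suffix handling as a small standalone sketch. The function name formatSuffix is made up for this example; the bookmarklet performs the same two steps on the variable s.

```javascript
// formatSuffix is a hypothetical wrapper around these two bookmarklet statements:
//   s=ouaps[4]??'';
//   if(!!s){s=" "+decodeURI(s)};
function formatSuffix(pathSegment) {
  let suffix = pathSegment ?? "";          // nullish becomes the empty string
  if (!!suffix) {                          // same as if (Boolean(suffix))
    suffix = " " + decodeURI(suffix);      // Main%20Page becomes " Main Page"
  }
  return suffix;
}

console.log(JSON.stringify(formatSuffix("Main%20Page")));
console.log(JSON.stringify(formatSuffix(undefined)));
```

The leading space is only added when there is a suffix, so the final list item does not end up with a stray trailing space.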
There are many other ways to check for this. Some performance measurements have been done by [Wayback/Archive] Kamil Kiełczewski at [Wayback/Archive] How can I check for an empty/undefined/null string in JavaScript? – Stack Overflow:
I perform tests on macOS v10.13.6 (High Sierra) for 18 chosen solutions. Solutions works slightly different (for corner-case input data) which was presented in the snippet below.
Conclusions
- the simple solutions based on !str, ==, === and length are fast for all browsers (A,B,C,G,I,J)
- the solutions based on the regular expression (test, replace) and charAt are slowest for all browsers (H,L,M,P)
- the solutions marked as fastest was fastest only for one test run – but in many runs it changes inside ‘fast’ solutions group
Back to my code.
Getting the database name
That is: if there is one. Not all saved pages had an error on them indicating the database name. The ones that did look like [Archive] Internal error – RAD Studio: XE8 main page:
[53d58941e2d881306538a66d] /RADStudio/XE8/en/Main_Page WikimediaRdbmsDBQueryError from line 1457 of /var/www/html/shared/BaseWiki31/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? Query: SELECT lc_value FROM `rad_xe8_en_l10n_cache` WHERE lc_lang = 'en' AND lc_key = 'deps' LIMIT 1 Function: LCStoreDB::get Error: 1146 Table 'wikidb.rad_xe8_en_l10n_cache' doesn't exist (10.50.1.120) Backtrace:
This line gets the div elements (see HTML element: div – Wikipedia) containing 1146:
divs=x('//div[contains(., "1146")]');
The x function is a condensed version of the getElementsByXPath I described last week in XPath based bookmarklets for Archive.is: more JavaScript fiddling!.
Note that it uses the || operator (in parent || document) to perform default assignment, as shown in [Wayback/Archive] 3 Ways to Set Default Value in JavaScript | SamanthaMing.com.
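The sketch below contrasts the || default with a default parameter. Both function names are hypothetical, and the string "document" merely stands in for the real document object so the code also runs outside a browser.

```javascript
// Default via ||: any falsy argument (undefined, null, "", 0) falls back.
function rootWithOr(parent) {
  return parent || "document";
}

// Default parameter: only a missing or undefined argument falls back.
function rootWithDefaultParameter(parent = "document") {
  return parent;
}

console.log(rootWithOr(), rootWithOr(null), rootWithDefaultParameter(), rootWithDefaultParameter(null));
```

For the x function either approach would do, as the callers pass a node or nothing at all; || just happens to be the shortest form.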
More Nullish undefined/null handling
There is another bit which has to do with Nullish values:
d=divs[divs.length-1]?.textContent.split("`")[1]??'';
One statement having both the ?. and ?? operator to handle both cases of Nullish: the div not existing, or the split not returning enough elements.
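Both cases can be reproduced without a DOM. In the sketch below, plain objects with a textContent property are hypothetical stand-ins for the div elements, and the error text is a shortened copy of the one shown earlier. The function name databaseNameFrom is mine.

```javascript
// Shortened copy of the error text from the archived page.
const errorText =
  "Query: SELECT lc_value FROM `rad_xe8_en_l10n_cache` WHERE lc_lang = 'en' " +
  "AND lc_key = 'deps' LIMIT 1 Function: LCStoreDB::get Error: 1146 " +
  "Table 'wikidb.rad_xe8_en_l10n_cache' doesn't exist";

// databaseNameFrom is a hypothetical wrapper around the bookmarklet statement.
function databaseNameFrom(divs) {
  // ?. covers "no div found"; ?? covers "no back-tick in the text".
  return divs[divs.length - 1]?.textContent.split("`")[1] ?? "";
}

console.log(databaseNameFrom([{ textContent: errorText }]));
console.log(JSON.stringify(databaseNameFrom([])));
```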
Wut, no regex?
When you look at the above code, there are no regular expressions in it.
I hesitated a bit when writing this part:
d=divs[divs.length-1]?.textContent.split("`")[1]??'';
It assumes the empirically observed pattern that the last div in the result contains the database name within back-ticks. If there is one at all. Doing it in regex would be at least as complex and require yet another language skill.
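For comparison, this is roughly what a regex variant could look like. It is only a sketch of an alternative, not what the bookmarklet uses; the sample text is a shortened version of the error message.

```javascript
const sampleQuery = "SELECT lc_value FROM `rad_xe8_en_l10n_cache` WHERE lc_lang = 'en'";

// The approach from the bookmarklet: split on back-ticks, take the second part.
const viaSplit = sampleQuery.split("`")[1] ?? "";

// A possible regex alternative: capture everything between the first pair of back-ticks.
const viaRegex = sampleQuery.match(/`([^`]*)`/)?.[1] ?? "";

// The two differ when the closing back-tick is missing.
const unclosed = "FROM `rad_xe8";
const unclosedViaSplit = unclosed.split("`")[1] ?? "";
const unclosedViaRegex = unclosed.match(/`([^`]*)`/)?.[1] ?? "";

console.log(viaSplit, viaRegex, unclosedViaSplit, JSON.stringify(unclosedViaRegex));
```

The regex still needs the same ?. and ?? guards, so it is not shorter, which matches my hesitation above.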
Anchor assembly
The easiest way to build an HTML anchor is by using an instance of [Archive/Archive] HTMLAnchorElement – Web APIs | MDN. These lines use the various properties of it:
aua.href=document.querySelector('link[rel="canonical"]')?.href;
oua.href=ou=document.getElementsByName("q")[0]?.value;
aua.text="Archive";
oua.text=document.title;
aua.target="_blank";
oua.target="_blank";
aua.rel="noopener";
oua.rel="noopener";
ouaps=oua.pathname.split("/");
Creation is easily done through [Wayback/Archive] Document.createElement() – Web APIs | MDN, but you have to know that the tag name a maps to HTMLAnchorElement:
aua=document.createElement("a"); oua=document.createElement("a");
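The pathname splitting can be tried outside the browser too. In this sketch the URL class stands in for the anchor element, on the assumption that both expose the same pathname for a given address; the variable names are mine.

```javascript
// The URL class stands in for the HTMLAnchorElement (an assumption for this
// sketch); both offer a pathname property.
const originalUrl = new URL(
  "https://docwiki.embarcadero.com/CodeExamples/Sydney/de/Special:Search/Main%20Page"
);

// The pathname starts with "/", so the first element after splitting is empty.
const pathParts = originalUrl.pathname.split("/");

const product = pathParts[1];   // ouaps[1] in the bookmarklet
const version = pathParts[2];   // ouaps[2]
const language = pathParts[3];  // ouaps[3]
const suffix = pathParts[4];    // ouaps[4]

console.log(product, version, language, suffix);
```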
String assembly
Assembling strings can be a tedious job. I prefer to use backticks for this, as they allow embedding JavaScript expressions within the template string; these are all the ${...} parts below:
li=`<li>[${aua.outerHTML}] ${oua.outerHTML} ${l} ${ouaps[2]} ${ouaps[1]}${s}${d}</li>`;
The mechanism is called [Wayback/Archive] Template literals (Template strings) – JavaScript | MDN
Template literals are string literals allowing embedded expressions. You can use multi-line strings and string interpolation features with them.
They were called “template strings” in prior editions of the ES2015 specification.
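A standalone sketch of the list item assembly follows. The function buildListItem and the example.org addresses are hypothetical; in the bookmarklet the two anchors come from outerHTML of the anchor elements.

```javascript
// buildListItem is a hypothetical wrapper around the template literal from the bookmarklet.
function buildListItem(parts) {
  const { archivedAnchor, originalAnchor, language, version, product, suffix = "", database = "" } = parts;
  return `<li>[${archivedAnchor}] ${originalAnchor} ${language} ${version} ${product}${suffix}${database}</li>`;
}

const listItem = buildListItem({
  archivedAnchor: '<a href="https://example.org/archived">Archive</a>',  // made-up address
  originalAnchor: '<a href="https://example.org/original">Title</a>',    // made-up address
  language: "English",
  version: "XE8",
  product: "RADStudio",
  suffix: " Main_Page",
  database: " <code>rad_xe8_en_l10n_cache</code>",
});

console.log(listItem);
```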
I briefly mentioned them 5 years ago in I wish the Delphi language supported multi-line strings, and they were part of a few examples in the more recent post Source: For my reading list: some links on Twitter bookmarklets.
They are cool and make code a lot more readable: it is immediately clear how the List Element (see HTML element: li – Wikipedia) is being built.
--jeroen
javascript:void(open('https://archive.is/?run=1&url='+encodeURIComponent(document.location)))
javascript:{
  function x(xpath, parent) {
    result = document.evaluate(xpath, parent || document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
    nodes = [];
    while (node = result.iterateNext())
      nodes.push(node);
    return nodes;
  }
  aua=document.createElement("a");
  oua=document.createElement("a");
  aua.href=document.querySelector('link[rel="canonical"]')?.href;
  o=document.querySelectorAll('input[readonly]')[0]?.value;
  o=o??document.getElementsByName("q")[0]?.value;
  oua.href=o;
  aua.text="Archive";
  oua.text=document.title;
  aua.target="_blank";
  oua.target="_blank";
  aua.rel="noopener";
  oua.rel="noopener";
  ouaps=oua.pathname.split("/");
  r=new Intl.DisplayNames(['en'], {type: 'language'});
  l=ouaps[3];
  l=r.of(l==="e"?"en":l);
  s=ouaps[4]??'';
  if(!!s){s=" "+decodeURI(s)};
  divs=x('//div[contains(., "1146")]');
  d=divs[divs.length-1]?.textContent.split("`")[1]??'';
  if(!!d){d=` <code>${d}</code>`};
  li=`<li>[${aua.outerHTML}] ${oua.outerHTML} ${l} ${ouaps[2]} ${ouaps[1]}${s}${d}</li>`;
  prompt("li", li);
}





