The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘Mojibake’ Category

When you bump into Mojibake in your development, don’t use table-based solutions to solve it

Posted by jpluimers on 2026/05/14

A while ago I bumped into [Wayback/Archive] Unicode weirdness – VCL – Delphi-PRAXiS [en].

It described a Mojibake problem where files converted from PDF to text contained odd-looking character sequences.

The solution – replacing these sequences with more correct-looking text – worked at first, but failed later because the underlying source code got “corrected”: the Mojibake character sequences it contained were converted into the correct Unicode text.

A better solution is to figure out what series of encoding/decoding steps will give the correct text.

This is where – again – [Wayback/Archive] Home – ftfy: fixes text for you comes up: a still indispensable tool.
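
A minimal sketch of what "figuring out the series of encoding/decoding steps" can look like, using only the Python standard library. The candidate list and the name `find_repairs` are assumptions for illustration. ftfy uses far more careful heuristics and, unlike this sketch, does not need to know the expected text in advance.

```python
from itertools import product

# Hypothetical shortlist of encodings that commonly show up in Mojibake.
CANDIDATES = ["utf-8", "cp1252", "latin-1", "cp850", "cp437"]


def find_repairs(garbled, expected):
    """Return (wrong, right) pairs for which re-encoding `garbled` as `wrong`
    and decoding those bytes as `right` yields `expected`."""
    hits = []
    for wrong, right in product(CANDIDATES, repeat=2):
        if wrong == right:
            continue
        try:
            repaired = garbled.encode(wrong).decode(right)
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this pair cannot represent the text at all
        if repaired == expected:
            hits.append((wrong, right))
    return hits


# Build a known-bad sample: UTF-8 bytes read as Windows-1252.
garbled = "Pyreneeën".encode("utf-8").decode("cp1252")
print(garbled)
print(find_repairs(garbled, "Pyreneeën"))
```

More than one pair can match, because Windows-1252 and Latin-1 agree on many byte values. Unlike a replacement table, the steps found this way keep working for characters that were never listed.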

–jeroen

Posted in Delphi, Development, Encoding, Mojibake, PDF, Software Development | Leave a Comment »

Decoding HTML encoded source to XML text

Posted by jpluimers on 2026/03/03

For Some links on getting the most recent defragmentation time of a Windows volume, I needed to copy some XML code back and forth between my ARM MacBook Pro and a remote Windows machine accessed via the Microsoft Windows App (the app formerly known as Microsoft Remote Desktop for Mac).

The problem is that the copying would lose line breaks. That does not change the meaning of the XML, but it does hurt human readability while editing the XML in the Event Viewer query dialog.

So I decided to go to the “Code” view in my Classic WordPress editor (did I ever tell you how much I dislike – especially the accessibility of – the not so new but still haughtily named Gutenberg editor?), copied the HTML encoded form and wanted to convert it to unencoded XML text.

Well, here I got into naming confusion land, which I will discuss further below, but first two of the potential solutions:
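
As a hedged aside (not necessarily one of the two solutions from the full post): the conversion itself can be sketched in Python with `html.unescape` from the standard library. The sample query and the name `decode_html_source` are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET


def decode_html_source(encoded):
    """Turn HTML-encoded text (as shown in an editor's code view) into plain XML text."""
    return html.unescape(encoded)


# Hypothetical sample, not the query from the post.
encoded = (
    "&lt;QueryList&gt;\n"
    "  &lt;Query Id=&quot;0&quot;&gt;\n"
    "    &lt;Select&gt;*&lt;/Select&gt;\n"
    "  &lt;/Query&gt;\n"
    "&lt;/QueryList&gt;"
)

xml_text = decode_html_source(encoded)
print(xml_text)

# Parsing confirms that the decoded text is well-formed XML again.
root = ET.fromstring(xml_text)
print(root.tag)
```

`html.unescape` removes exactly one level of encoding and leaves line breaks alone, so entities that belong to the XML itself (stored as `&amp;amp;` in the HTML source) come out as `&amp;`.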

Read the rest of this entry »

Posted in Cyberchef, Development, Encoding, HTML, Mojibake, Software Development, URL Encoding, Web Development | Leave a Comment »

Wijkcentrales – VVDSL.robinflikkema.nl

Posted by jpluimers on 2025/09/17

For my link archive: KPN telephone exchanges, including a few where family members or I had a connection:

[Wayback/Archive] Wijkcentrales – VVDSL.robinflikkema.nl

Wijkcentrales (local exchanges)
Code      Place
Ssh       [Wayback/Archive] Wijkcentrale: Sassenheim – VVDSL.robinflikkema.nl
Asd-Bdh   [Wayback/Archive] Wijkcentrale: Amsterdam-Badhoevedorp – VVDSL.robinflikkema.nl
Asd-Osdp  [Wayback/Archive] Wijkcentrale: Amsterdam-Osdorp – VVDSL.robinflikkema.nl
Nhout     [Wayback/Archive] Wijkcentrale: Noordwijkerhout – VVDSL.robinflikkema.nl
Lis       [Wayback/Archive] Wijkcentrale: Lisse – VVDSL.robinflikkema.nl

These had Mojibake with the generic replacement character (“�”):

Wijkcentrales (local exchanges)
Code      Place
Ctlr      [Wayback/Archive] Wijkcentrale: Castelr� – VVDSL.robinflikkema.nl (should be Castelré)
Odi       [Wayback/Archive] Wijkcentrale: St. Odili�nberg – VVDSL.robinflikkema.nl (should be Sint Odiliënberg)
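
A minimal sketch of how such a replacement character can arise, assuming (this is a guess, the site's actual pipeline is unknown) that Latin-1 bytes were decoded as UTF-8 with invalid bytes replaced. The helper name `decode_with_replacement` is made up for illustration.

```python
def decode_with_replacement(text, actual="latin-1", assumed="utf-8"):
    """Encode `text` as `actual`, then decode the bytes as `assumed`,
    replacing every undecodable byte with U+FFFD."""
    return text.encode(actual).decode(assumed, errors="replace")


for name in ("Castelré", "Sint Odiliënberg"):
    print(name, "->", decode_with_replacement(name))

# Two different letters collapse into the same character, so the original
# cannot be recovered from the damaged text alone.
assert decode_with_replacement("é") == decode_with_replacement("ë") == "\ufffd"
```

This kind of Mojibake is lossy. Once “�” is stored, no series of encoding/decoding steps brings the original letter back, and only outside knowledge of the place names does.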

Note: the [Wayback/Archive] fourstack KPN UI (built at the time by [Wayback/Archive] FourStack) has been offline since 2021, see [WaybackSave/ArchiveBad] FPI Fourstack Snelheid DSL – Internet en hosting – GoT, so the data is no longer being updated.

–jeroen

Posted in ADSL, Development, Encoding, Internet, ISDN, ISP, KPN, Mojibake, Power User, PSTN, Software Development, Telephony | Leave a Comment »

Sequoiaview alternatives

Posted by jpluimers on 2025/06/12

I wrote about Sequoiaview in depth in SequoiaView Homepage, made some research notes in “cushion treemap” delphi – Google Search and touched it slightly in A choco install list.

I never heard back from my request for Sequoiaview source code, and given ever increasing local storage media sizes, the speed of it now has become an issue, so I started looking to see if more alternatives have appeared and what sets them apart.

TL;DR

  1. There is the open source WinDirStat that runs as non-admin and is about as slow as Sequoiaview
  2. There is the closed source but free for personal use WizTree that requires admin elevation and is much faster than Sequoiaview and WinDirStat

Neither of them allows for a view that is cushion treemap only.

WizTree is fast because it reads the information directly from the NTFS MFT (Master File Table). This requires elevated permissions.

This is the same mechanism used by the Everything search tool, but unlike Everything, WizTree:

Read the rest of this entry »

Posted in C++, Development, Encoding, Mojibake, Software Development, UTF-8, Windows Development | Leave a Comment »

Notes on a Warmink / Wuba grandfather clock with three melodies and hands for minutes+hours, seconds, weekday, month, and moon phase

Posted by jpluimers on 2025/02/10

Basically a wind-up (3 weights) Warmink Wuba triple chime (Westminster, St. Michael, Whittington) clock, built by my grandfather.

It barely ran any more, and it no longer struck either a melody or the hour.

Below are links that helped me find out what still exists of this brand and what knowledge is still available.

Read the rest of this entry »

Posted in About, Development, DIY, Encoding, LifeHacker, Mojibake, Personal, Power User, Software Development | Leave a Comment »

As a tribute to their @isotopp handle history, Kris now changed his name to Köhntopp

Posted by jpluimers on 2024/12/17

[Wayback/Archive] Jeroen Wiert Pluimers: “LOL, just saw @isotopp changed…” – Mastodon

LOL, just saw @isotopp changed his name to Köhntopp

Well done, Kris. Well done.

ftfy.vercel.app/?s=ö

( the history of the iso isotopp handle is so great, that I was glad I captured it from Twitter before that content got deleted; it is now at wiert.me/2022/06/09/how-isotop )

This Vercel app cannot be archived properly in the Wayback Machine, as it then returns an HTTP 500. The Archive.is save succeeded though: [Wayback/Archive] https://ftfy.vercel.app/?s=ö:

Read the rest of this entry »

Posted in Development, Encoding, ISO-8859, ISO8859, Mojibake, Software Development, Unicode, UTF-8 | Leave a Comment »

A while ago I bumped into some GPI Mojibake examples, but soon found out I should use the ftfy test cases

Posted by jpluimers on 2022/11/22

I have been bumping into more and more Mojibake example pages like [Wayback] Mojibake: Question Marks, Strange Characters and Other Issues | GPI:

Have you ever found strange characters like these ���  when viewing content in applications or websites in other languages?

They made me realise that all these (including the Mojibake examples on my blog) are just artifacts, but the real list of examples is the set of ftfy test cases at [Wayback/Archive.is] python-ftfy/test_cases.json at master · LuminosoInsight/python-ftfy

I got reminded when Waternet moved from paper mail using “Pyreneeën” to email using “Pyreneeën”. Not as bad as what Waterschap AGV did earlier: they took it one level further and made “Pyreneeën” out of it, see Last year, a classic Mojibake was introduced when Waterschap Amstel, Gooi en Vecht redesigned their IT systems.
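
How “one level further” comes about can be sketched as follows, assuming each level is UTF-8 bytes misread as Windows-1252. That is an assumption for illustration, as the systems involved may have used another code page. The names `misdecode` and `repair` are made up.

```python
def misdecode(text, rounds=1, wrong="cp1252"):
    """Simulate `rounds` levels of Mojibake: UTF-8 bytes read as `wrong`."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode(wrong)
    return text


def repair(text, wrong="cp1252"):
    """Undo as many levels as possible by reversing the same steps."""
    while True:
        try:
            fixed = text.encode(wrong).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            return text  # no further level can be undone
        if fixed == text:
            return text  # pure ASCII, nothing left to repair
        text = fixed


for rounds in (1, 2):
    garbled = misdecode("Pyreneeën", rounds)
    print(rounds, garbled, "->", repair(garbled))
```

Each level makes every non-ASCII character longer, which is why the doubled form looks so much worse. This naive `repair` can also mangle text that was fine to begin with, which is one reason to prefer the heuristics in ftfy.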

This seems like a trend where newer systems perform worse than older systems. I wonder why that is.

BTW: the trick on the [Wayback/Archive] Python.org shell to run ftfy (which is not installed by default) is first dropping to the shell (see my post How do I drop a bash shell from within Python? – Stack Overflow), then starting python again:

Read the rest of this entry »

Posted in CP850, Development, Encoding, ftfy, ISO-8859, Mojibake, Python, Scripting, Software Development, Unicode, UTF-8, UTF8 | Leave a Comment »

Get it at a discount while it is hot: Delphi Thread Safety Patterns eBook by Dalija Prasnikar and Neven Prasnikar Jr.

Posted by jpluimers on 2022/06/01

Get the new [Wayback/Archive] Delphi Thread Safety Patterns eBook at a discount while it is hot:

Use Coupon Code: DTSPATT10 at checkout to get a $10 discount.
This promotional offer is valid through June 14.

Read the rest of this entry »

Posted in Delphi, Development, Encoding, ISO-8859, ISO8859, Mojibake, Multi-Threading / Concurrency, Software Development, UTF-8, Windows-1252 | Leave a Comment »

Last year, a classic Mojibake was introduced when Waterschap Amstel, Gooi en Vecht redesigned their IT systems

Posted by jpluimers on 2022/03/16

Last year, Waterschap Amstel, Gooi en Vecht sent me a paper letter notifying the yearly water bill was going to be late as they were redesigning their IT systems.

Their letter introduced a classic Mojibake that had not been present in all their older paper letter communication.

  • Street name on a letter via the old IT systems is "Pyreneeën":

    Pyreneeën printed correctly.

  • Street name on a letter via the new IT systems is "Pyreneeën":

    Pyreneeën printed with Mojibake distortions.

Read the rest of this entry »

Posted in Development, Encoding, ftfy, Mojibake, Python, Software Development, Unicode, UTF-8, UTF8 | Leave a Comment »

The things I didn’t notice during cancer survival: ftfy 6.0 and more versions got released during my recovery (including the poem “Ode to a Shipping Label”)

Posted by jpluimers on 2022/03/10

When writing this, [Wayback/Archive.is] ftfy · PyPI:history indicates ftfy was already at 6.0.3.

It is still my go-to tool for figuring out the cause of Mojibake. I remember first writing about it in 2016 (see the ftfy category), when it was already at version 3.0, after discovering it a few Mojibake posts in.

By now it even understands right-to-left Mojibake garbage: [Archive.is] Elia Robyn Speer on Twitter: “ftfy 5.8 is out! … A user reported that Hebrew text wasn’t being fixed, and this made me think about how to expand some of the trickier cases to non-Latin alphabets.”

Mojibake mishaps still happen a lot, so by now I hope I will have done a Mojibake themed Delphi talk at one or more conferences.

Read the rest of this entry »

Posted in !!con (bangbangcon), About, Autistic Spectrum/Autism, Cancer, Conference Topics, Conferences, Development, Encoding, Event, ftfy, Mojibake, Personal, Python, Rectum cancer, Scripting, Software Development, Unicode | Leave a Comment »