I have been running into more and more Mojibake example pages like [Wayback] Mojibake: Question Marks, Strange Characters and Other Issues | GPI
Have you ever found strange characters like these ��� when viewing content in applications or websites in other languages?
They made me realise that all these (including the Mojibake examples on my blog) are just artifacts, but the real list of examples is the set of ftfy test cases at [Wayback/Archive.is] python-ftfy/test_cases.json at master · LuminosoInsight/python-ftfy
I got reminded when Waternet moved from paper mail using “Pyreneeën” to email using “PyreneeÃ«n”. Not as bad as what Waterschap AGV did earlier: they took it one level further and made “PyreneeÃƒÂ«n” out of it, see Last year, a classic Mojibake was introduced when Waterschap Amstel, Gooi en Vecht redesigned their IT systems.
This seems like a trend where newer systems perform worse than older systems. I wonder why that is.
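The two levels of damage to “Pyreneeën” can be reproduced with nothing but the Python standard library. This is a minimal sketch, not how ftfy works internally. The helper names `make_mojibake` and `undo_mojibake` are made up for illustration, and it assumes the offending systems read UTF-8 bytes as Windows-1252 (`cp1252`), which is the most common cause but not confirmed for either water company.

```python
# Hypothetical helpers for illustration; not taken from ftfy.
# Assumption: the broken system decodes UTF-8 bytes as Windows-1252.

def make_mojibake(text: str, levels: int = 1, wrong_encoding: str = "cp1252") -> str:
    """Simulate reading UTF-8 bytes with the wrong encoding, `levels` times."""
    for _ in range(levels):
        text = text.encode("utf-8").decode(wrong_encoding)
    return text


def undo_mojibake(text: str, levels: int = 1, wrong_encoding: str = "cp1252") -> str:
    """Reverse the damage by re-encoding with the wrong encoding and decoding as UTF-8."""
    for _ in range(levels):
        text = text.encode(wrong_encoding).decode("utf-8")
    return text


original = "Pyreneeën"
once = make_mojibake(original)             # one level of damage
twice = make_mojibake(original, levels=2)  # "one level further"

print(once)                            # PyreneeÃ«n
print(twice)                           # PyreneeÃƒÂ«n
print(undo_mojibake(twice, levels=2))  # Pyreneeën
```

The naive round trip only works when you already know the wrong encoding and the number of levels, and when no bytes were lost along the way. Guessing those for arbitrary text is the hard part that ftfy's `fix_text` takes care of.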
BTW: the trick on the [Wayback/Archive] Python.org shell to run ftfy (which is not installed by default) is to first drop to the shell (see my post How do I drop a bash shell from within Python? – Stack Overflow), then start Python again: