Yet another interesting post (I think it is via [Wayback/Archive] isotopp, but I cannot find the original message any more): [Wayback/Archive] Migration from Twitter to Mastodon | ads’ corner.
Archive for November 22nd, 2022
A while ago I bumped into some GPI Mojibake examples, but soon found out I should use the ftfy test cases
Posted by jpluimers on 2022/11/22
I have been bumping into more and more Mojibake example pages, like [Wayback] Mojibake: Question Marks, Strange Characters and Other Issues | GPI:
Have you ever found strange characters like these ��� when viewing content in applications or websites in other languages?
They made me realise that all of these (including the Mojibake examples on my blog) are just individual artifacts; the real list of examples is the set of ftfy test cases at [Wayback/Archive.is] python-ftfy/test_cases.json at master · LuminosoInsight/python-ftfy.
I got reminded when Waternet moved from paper mail using “Pyreneeën” to email using “PyreneeÃ«n”. Not as bad as what Waterschap AGV did earlier: they took it one level further and made “PyreneeÃƒÂ«n” out of it, see Last year, a classic Mojibake was introduced when Waterschap Amstel, Gooi en Vecht redesigned their IT systems.
This seems like a trend where newer systems perform worse than older systems. I wonder why that is.
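The one-level and two-level Mojibake of “Pyreneeën” can be reproduced with nothing but the Python standard library. This is a minimal sketch, assuming the wrong codec in both systems was Windows-1252 (the mails themselves do not tell); the function names `garble` and `ungarble` are made up for this illustration.

```python
def garble(text: str, levels: int = 1, wrong: str = "cp1252") -> str:
    """Encode as UTF-8, then decode with the wrong codec, `levels` times."""
    for _ in range(levels):
        text = text.encode("utf-8").decode(wrong)
    return text


def ungarble(text: str, levels: int = 1, wrong: str = "cp1252") -> str:
    """Undo `garble`: encode with the wrong codec, then decode as UTF-8."""
    for _ in range(levels):
        text = text.encode(wrong).decode("utf-8")
    return text


original = "Pyreneeën"
once = garble(original)        # one level:  PyreneeÃ«n
twice = garble(original, 2)    # two levels: PyreneeÃƒÂ«n
print(once, twice)

# Both levels can be reversed, as long as the assumed codec is the right one.
assert ungarble(once) == original
assert ungarble(twice, 2) == original
```

Reversing by hand like this only works when you know the codec and the number of levels; ftfy is useful precisely because it guesses both.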
BTW: the trick to run ftfy (which is not installed by default) on the [Wayback/Archive] Python.org shell is to first drop to the shell (see my post How do I drop a bash shell from within Python? – Stack Overflow), then start python again.
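A rough sketch of how such a session could look. The shell steps in the comments are my assumption of one way to do it (`pty.spawn` from the standard library, then `pip`), not the exact commands from the original session, and the helper `fix_mojibake` is hypothetical. The only ftfy call used is `ftfy.fix_text`.

```python
import importlib.util

# Assumed steps on an online Python shell (a guess, not the original commands):
#   >>> import pty; pty.spawn("/bin/bash")   # drop to a bash shell
#   $ python -m pip install --user ftfy      # install the missing package
#   $ python                                 # start Python again


def fix_mojibake(text: str) -> str:
    """Repair text with ftfy when it is installed, else return it unchanged."""
    if importlib.util.find_spec("ftfy") is None:
        return text  # ftfy is not available, so there is nothing to repair with
    import ftfy  # third-party package, imported only when present

    return ftfy.fix_text(text)


print(fix_mojibake("PyreneeÃ«n"))
```

Checking for the package first keeps the snippet usable in a fresh shell where ftfy has not been installed yet.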
Posted in CP850, Development, Encoding, ftfy, ISO-8859, Mojibake, Python, Scripting, Software Development, Unicode, UTF-8, UTF8