The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

installing the UTF-8 encoding ftfy (fixes text for you) – via version 3.0 | Luminoso Blog

Posted by jpluimers on 2016/09/06

Simple if you know it:

pip install ftfy

That installs it as a command, which is a lot easier than using it from GitHub at [Wayback] https://github.com/LuminosoInsight/python-ftfy
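For use from Python instead of the command line, here is a minimal sketch. It assumes the package installed above is importable and relies on `ftfy.fix_text`; the sample string is made up for illustration and does not come from this post. The import is guarded only so the sketch still runs on a machine without ftfy.

```python
# Hypothetical sample: UTF-8 bytes misread as Windows-1252, a classic mojibake.
intended = "✔ No problems"
mangled = intended.encode("utf-8").decode("cp1252")

try:
    import ftfy  # third-party package, installed with: pip install ftfy
except ImportError:
    ftfy = None  # keep the sketch runnable even without the package

if ftfy is not None:
    # fix_text tries to detect the mis-decoding and reverse it
    print(ftfy.fix_text(mangled))
else:
    print("ftfy is not installed; the text stays mangled:", mangled)
```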

It knows how to solve the encoding issues in [Archive.is] the future of publishing at W3C explaining about WTF-8 and Unicode history.

It didn’t solve my non-Unicode encoding issue: [Wayback] “v3/43/4r” -> “v¾¾r” -> “vóór”.

That was caused by an infamous Western Latin character set confusion, in this case ISO-8859/Windows- versus CP850/CP858 encoding (so: no Unicode involved at all, nor CP437, as it doesn’t have ¾).
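The mix-up can be reproduced with nothing but Python's built-in codecs. This is a sketch under one assumption: that the ISO-8859 side is ISO-8859-1 (Python's `latin-1`), where ó is byte 0xF3. The `repair` helper is a hypothetical name for illustration, not part of ftfy.

```python
# The word as it was meant to appear.
original = "vóór"

# Assumption: the text was stored as ISO-8859-1 (Latin-1) bytes...
raw = original.encode("latin-1")   # ó becomes the single byte 0xF3

# ...and then read back as CP850, where byte 0xF3 is the ¾ character.
mangled = raw.decode("cp850")


def repair(text, wrong="cp850", right="latin-1"):
    """Undo a single-byte code page mix-up by reversing the wrong decode."""
    return text.encode(wrong).decode(right)


def cp437_has(char):
    """Report whether CP437 can encode the given character."""
    try:
        char.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False


print(mangled)
print(repair(mangled))
print(cp437_has("¾"))
```

No Unicode encoding is involved at any step: both sides are single-byte code pages, so the repair is just the wrong decode run backwards. Passing `wrong="cp858"` should behave the same for this word, since CP858 differs from CP850 only at the euro sign position.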

So [Wayback] I put in a suggestion for ftfy to support finding the above.

PS (20220424): I tracked down the offending document again at [Wayback] g428-1.pdf

–jeroen

via

PS: these manglings are called Mojibake

4 Responses to “installing the UTF-8 encoding ftfy (fixes text for you) – via version 3.0 | Luminoso Blog”

  1. ruurd said

    Nah. Just use utf-8 instead of them funky charsets from last century. Stupid Windows…
