Archive for the ‘Unicode’ Category

I learned: MacOS has a Unicode Hex Input keyboard

Posted by jpluimers on 2023/05/25

A while ago, I learned that MacOS has had a Unicode Hex Input keyboard for ages.

It is not installed by default, so you have to manually add it:

Start the System Preferences.app
Open the Keyboard icon
Choose the Input Sources tab
Click the plus (+) icon
Search for Unicode or Hex so that Unicode Hex Input is the only entry in the list
Click the Add button
Choose the Keyboard tab
Enable Show keyboard and emoji viewers in menu bar

Now you can select Unicode Hex Input from the input menu in the menu bar.

After that, while you hold the Option key, typing any 4-digit hexadecimal Unicode code point gets you the corresponding character.
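To find out which digits to type, a small Python sketch can help. It lists the 4-digit hex groups for each UTF-16 code unit of a string. The function names are made up for this illustration, and the idea that characters above U+FFFF are entered as two groups (a surrogate pair) is, as far as I know, how the Unicode Hex Input keyboard behaves; treat that part as an assumption.

```python
def hex_input_keys(text: str) -> list[str]:
    """Return one 4-digit hex group per UTF-16 code unit of text."""
    # Big-endian UTF-16 without BOM: exactly two bytes per code unit.
    data = text.encode("utf-16-be")
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]


def from_hex_input(keys: list[str]) -> str:
    """Inverse of hex_input_keys: rebuild the text from the hex groups."""
    data = b"".join(bytes.fromhex(key) for key in keys)
    return data.decode("utf-16-be")


if __name__ == "__main__":
    for sample in ("ß", "ſʒ", "👻"):
        print(sample, hex_input_keys(sample))
```

Characters in the Basic Multilingual Plane need a single group; anything beyond it comes out as two groups.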

Read the rest of this entry »

Posted in Apple, Development, Encoding, Mac OS X / OS X / MacOS, Power User, Software Development, Unicode

Berlin Typography on Twitter: “The best of #TypeInBerlin: The tʒ and ſʒ ligatures, together at last.” / Güntʒelstraſʒe == Güntzelstraße

Posted by jpluimers on 2023/04/17

I learned something new a while ago: I knew about the ſʒ ligature (which nowadays is usually written as ß), but the tʒ ligature was new to me.

So: Güntʒelstraſʒe == Güntzelstraße.

References:

[Wayback/Archive] What Unicode character is this ?
tʒ is a combination of these Unicode code points:
- t [Wayback/Archive] U+0074 : LATIN SMALL LETTER T
- ʒ [Wayback/Archive] U+0292 : LATIN SMALL LETTER EZH {dram}
ſʒ is a combination of these Unicode code points:
- ſ [Wayback/Archive] U+017F : LATIN SMALL LETTER LONG S
- ʒ [Wayback/Archive] U+0292 : LATIN SMALL LETTER EZH {dram}
Ligature (writing): Stylistic ligatures – Wikipedia
ß: Origin and development – Wikipedia
[Wayback/Archive] tʒ – Google Search
- [Wayback/Archive] I found this on an old legal document in Germany, any idea what fonts have these kinds of ligatures for tʒ and ch/ck? : identifythisfont
[Wayback/Archive] ſʒ – Google Search

Source: [Archive.is] Berlin Typography on Twitter: “The best of #TypeInBerlin: The tʒ and ſʒ ligatures, together at last. …” / Twitter
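The code point breakdown in the references can be reproduced with Python's standard unicodedata module. This is only a minimal sketch; the helper name describe is made up for the illustration.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


if __name__ == "__main__":
    for ligature in ("tʒ", "ſʒ"):
        print(ligature, describe(ligature))
    # The two spellings are the same street name to a reader,
    # but they are different strings to a computer.
    print("Güntʒelstraſʒe" == "Güntzelstraße")
```

The comparison at the end is there as a reminder that the == in the title is typographic, not something a string comparison will agree with.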

Read the rest of this entry »

Posted in Development, Encoding, LifeHacker, Power User, Software Development, Unicode

A while ago I bumped into some GPI Mojibake examples, but soon found out I should use the ftfy test cases

Posted by jpluimers on 2022/11/22

I have been bumping into more and more Mojibake example pages like [Wayback] Mojibake: Question Marks, Strange Characters and Other Issues | GPI:

Have you ever found strange characters like these �� when viewing content in applications or websites in other languages?

They made me realise that all these (including the Mojibake examples on my blog) are just artifacts, but the real list of examples is the set of ftfy test cases at [Wayback/Archive.is] python-ftfy/test_cases.json at master · LuminosoInsight/python-ftfy

I got reminded of this when Waternet moved from paper mail using “Pyreneeën” to email using “PyreneeÃ«n”. That is not as bad as what Waterschap AGV did earlier: they took it one level further and made “PyreneeÃÂ«n” out of it, see Last year, a classic Mojibake was introduced when Waterschap Amstel, Gooi en Vecht redesigned their IT systems.

This seems like a trend where newer systems perform worse than older systems. I wonder why that is.

BTW: the trick on the [Wayback/Archive] Python.org shell to run ftfy (which is not installed by default) is first dropping to the shell (see my post How do I drop a bash shell from within Python? – Stack Overflow), then starting python again:

Read the rest of this entry »

Posted in CP850, Development, Encoding, ftfy, ISO-8859, Mojibake, Python, Scripting, Software Development, Unicode, UTF-8, UTF8

Unicode symbols in a batch file – Stack Overflow

Posted by jpluimers on 2022/06/30

Even with a batch file saved as UTF-8 (with or without BOM), by default it does not show most non-ASCII Unicode characters.

The reason is that the default console codepage usually is an OEM one like codepage 437, not UTF-8.

Thanks [Wayback] niutech for answering [Wayback/Archive.is] Unicode symbols in a batch file – Stack Overflow:

You can manually set the codepage to UTF-8 by typing chcp 65001 at the top of your batch file.

Codepage 65001 is Windows speak for the UTF-8 code page. I have some more blog entries mentioning codepage 65001.
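Why the code page matters can be shown with a few lines of Python. This is a rough sketch that assumes a console on code page 437; it only mimics what happens when UTF-8 bytes from a file are interpreted in that code page, and the variable names are made up for the illustration.

```python
ghost = "👻"

# These are the bytes that end up in a batch file saved as UTF-8.
utf8_bytes = ghost.encode("utf-8")
print(utf8_bytes.hex(" "))

# A console on code page 437 treats every byte as one character.
as_cp437 = utf8_bytes.decode("cp437")
print(len(as_cp437), "characters instead of 1")

# Nothing is lost: reading the same bytes as UTF-8 restores the emoji.
print(as_cp437.encode("cp437").decode("utf-8") == ghost)
```

The bytes in the file are fine all along; only their interpretation changes, which is what chcp 65001 fixes.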

An example where I needed this was to show how to address the localghost from a batch file (see The spookback localghost address to resolve 👻). This was the resulting UTF-8 saved batch file:

chcp 65001
ping 👻
ping xn--9q8h
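The two ping lines address the same name: xn--9q8h is the Punycode (ASCII compatible) form of 👻. A minimal Python sketch using the standard punycode codec shows the relation. The function names are made up for the illustration, and the sketch deliberately skips the normalisation steps that full IDNA processing performs, so it is only meant for a single label like this one.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded domain label


def to_ace(label: str) -> str:
    """Convert a single non-ASCII label to its xn-- form (no IDNA normalisation)."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace(ace: str) -> str:
    """Convert an xn-- label back to its Unicode form."""
    return ace[len(ACE_PREFIX):].encode("ascii").decode("punycode")


if __name__ == "__main__":
    print(to_ace("👻"))
    print(from_ace("xn--9q8h") == "👻")
```

The second ping line is pure ASCII, so it works regardless of the active code page.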

For single-byte non-ASCII characters, you can usually get away with setting the encoding of your batch file to your default code page as mentioned in [Wayback/Archive.is] cmd – Using box-drawing Unicode characters in batch files – Stack Overflow.

–jeroen

Posted in Batch-Files, Development, Encoding, Scripting, Software Development, Unicode, UTF-8, Windows Development

Last year, a classic Mojibake was introduced when Waterschap Amstel, Gooi en Vecht redesigned their IT systems

Posted by jpluimers on 2022/03/16

Last year, Waterschap Amstel, Gooi en Vecht sent me a paper letter notifying me that the yearly water bill was going to be late because they were redesigning their IT systems.

Their letter introduced a classic Mojibake that had not been present in any of their older paper letters.

Street name on a letter via the old IT systems is "Pyreneeën":
Street name on a letter via the new IT systems is "PyreneeÃÂ«n":

Read the rest of this entry »

Posted in Development, Encoding, ftfy, Mojibake, Python, Software Development, Unicode, UTF-8, UTF8

The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests
