The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

When someone writes UTF-8 and UTF-16 strings to the same file in binary format without converting between them…

Posted by jpluimers on 2017/06/21

A while ago, I had to fix an application that used a binary mechanism to write UTF-8 and UTF-16 strings (part of them XML in various flavours) to the same byte stream without converting between the two encodings.
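To make the problem concrete, here is a minimal Python sketch. It is not the original application's code: the function names (`write_mixed`, `decodes_cleanly`, `write_utf8`), the sample XML fragments and the choice of UTF-16LE are all assumptions made for illustration. It writes one UTF-8 string and one UTF-16 string back to back, shows that the result is no longer valid UTF-8, and then normalizes every chunk to UTF-8 (without a BOM) before writing.

```python
import io


def write_mixed(stream: io.BytesIO, first: str, second: str) -> None:
    """Reproduce the problem: two encodings back to back, nothing marks the switch."""
    stream.write(first.encode("utf-8"))
    stream.write(second.encode("utf-16-le"))  # assumption: little-endian UTF-16


def decodes_cleanly(raw: bytes, encoding: str) -> bool:
    """Return True when the whole buffer is valid in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def write_utf8(stream: io.BytesIO, chunks) -> None:
    """Normalize every (data, source_encoding) chunk to UTF-8 before writing.

    Python's "utf-8" codec does not emit a BOM, so none ends up in the stream.
    """
    for data, source_encoding in chunks:
        stream.write(data.decode(source_encoding).encode("utf-8"))


# Hypothetical sample data; \u00e9 is "e with acute accent".
first_text = "<a>\u00e9</a>"
second_text = "<b>\u00e9</b>"

mixed = io.BytesIO()
write_mixed(mixed, first_text, second_text)
mixed_bytes = mixed.getvalue()

fixed = io.BytesIO()
write_utf8(fixed, [
    (first_text.encode("utf-8"), "utf-8"),
    (second_text.encode("utf-16-le"), "utf-16-le"),
])
fixed_bytes = fixed.getvalue()

print(decodes_cleanly(mixed_bytes, "utf-8"))  # False
print(decodes_cleanly(fixed_bytes, "utf-8"))  # True
```

The fix only works when the writer knows the encoding of each chunk at the moment it writes it. Once the bytes are in the stream without any marker, a reader has to guess where one encoding stops and the other starts.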

Some links that helped me investigate what was wrong, choose what encoding to use for storage and fix it:

–jeroen

3 Responses to “When someone writes UTF-8 and UTF-16 strings to the same file in binary format without converting between them…”

  1. UTF-16LE or UTF-16BE? ;) I hope you chose to store all text as UTF-8 in your binary stream.

    • KMorwath said

      Or at least you have a BOM or other metadata to tell readers what to expect… but I see very few cases where mixing encodings makes sense.

    • jpluimers said

      Because other tooling is involved, I cannot add a BOM, but at least the output is UTF-8 now.
