The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,860 other subscribers

Archive for the ‘Unicode’ Category

CodeRage 4: sessions recorded; Delphi 2010 migration was a beeze; samples/slides will be uploaded soon

Posted by jpluimers on 2009/09/05

I just finished recording my CodeRage 4 sessions:

  • Practical XML in Delphi
  • Reliable Communication between Applications with Delphi and ActiveMQ
  • Using Unicode and Other Encodings in your Programs

CodeRage 4 is a free, virtual conference on Embarcadero technologies with a lot of Delphi sessions.
It is held from September 8 till 11, 2009, i.e. next week :-)
If you want to watch sessions live, be sure to register through LiveMeeting (the technology they use for making this all happen).

This week, I found some time do migrate all the sample projects to the release versions of Delphi Win32 2010 and Delphi Prism 2010.

Delphi Win32 2010 works like a charm: it is much faster and has a much smaller footprint than any other Galileo based IDE.
In fact, it feels almost as fast as the pre-Galileo based IDE’s.
With the added benefit that all the new features make me much more productive, not the least because it has not yet crashed on me this week once.
Crashing has been a frequent thing on me since Delphi 4 (maybe I should not even mention that number ), for most IDE’s at least a couple of times a week, so this is good.

Delphi Prism 2010 works really nice too, it is rock solid, and the language as some great features not found in other .NET languages.
But it still needs a tiny bit more polishing on the Visual Studio IDE Integration part.
There are a few things not as smoothly integrated as I’m used to in C# and VB .NET (for instance when adding assembly references; C# and VB.NET allow you to do that from multiple places in the IDE; Delphi Prism from only one).
I know it is nitpicking (the same holds for the Team Foundation System integration in the Visual Studio IDE: ever tried to add files or folders? There is only one icon that allows you to do it. Ever tried to move files or folders around? No way you can drag & drop, in fact you can move only 1 file or folder at a time, and then the folder tree leaves you at the target).

The Embarcadero folks have worked hard on developer productivity in the Delphi Win32 2010 IDE.
(Did I mention the F6 key? It is an awesome way of directly jumping into configuration dialogs a zillion levels deep.
Did I mention the Ctrl-D key? It instantly reformats your source code to your formatting settings).
So maybe it is now time to put some of that effort into the Prism side as well.

Back to my CodeRage sessions: the recordings are done, they will soon become available as downloads together with the samples/slides.

Keep watching :-)

–jeroen

Posted in .NET, CommandLine, Database Development, Debugging, Delphi, Development, Encoding, Event, Firebird, InterBase, Java, Package Development, Prism, Software Development, Source Code Management, TFS (Team Foundation System), Unicode, Visual Studio and tools, XML, XML/XSD, XSD | Leave a Comment »

StUF – receiving data from a provider where UTF-8 is in fact ISO-8859

Posted by jpluimers on 2009/05/08

Recently when receiving information from a StUF webservice created by a large Dutch provider of government IT systems, we had an issue with characters having their high bit set.

Although the web-service pretended to send their information as UTF-8, in fact they were encoding using a form of ISO_8859.

The most likely character set they used is ISO-8859-1 (since that is the default encoding for the HTTP protocol), but it might also be ISO-8859-15 which is an adaption of ISO-8859-1 trading some typographic characters for the euro-sign and some characters from French and some characters used for transliteration of  Russian, Finnish and Estonian.
(note that the printable characters of both ISO-8859-1 and ISO-8859-15 can be displayed by the Windows-1252 code page)

Since it is not possible to reliably “guess” the right encoding (there are way to many possibilities, even IsTextUnicode that is used by Notepad fails, see below), the only way is to use a fixed reencoding that depends on the StUF data provider. Read the rest of this entry »

Posted in Development, Encoding, ISO-8859, ISO8859, Mojibake, Software Development, The Old New Thing, Unicode, UTF-8, UTF8, Windows Development, XML, XML/XSD | 5 Comments »

.NET/C# – converting UTF8 to ASCII (yes, you *can* loose information with this) using System.Text.Encoding

Posted by jpluimers on 2009/04/23

Quite a while ago, we needed to exchange some text files with .NET and a DOS application (yes, they still exist!).

Since the .NET app already exchanged text files with other applications using UTF8, we had to reencode those into plain ASCII.
(yes, I am aware there are dozens of codepages we could encode to, we decided to stick with 7-bit ASCII, and warned the client about possible information loss).

A couple of months later, we neede to exchange information with an app doing Windows-1252, and then even later on to a web-app needing ISO 8859-1 (both are Western European encodings).
So I decided to refactor the UTF8 to ASCII conversion app into something more maintainable.

But first let me show you how you can dump all of the .NET supported encodings:
Read the rest of this entry »

Posted in .NET, ASCII, C#, Development, Encoding, Software Development, Unicode, UTF-8, UTF8 | 3 Comments »