The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘Encoding’ Category

Delphi hinting directives: deprecated, experimental, library and platform

Posted by jpluimers on 2014/10/01

I’ve been experimenting with the Delphi hinting directives lately to make it easier to migrate some libraries to newer versions of Delphi and newer platforms.

Hinting directives (deprecated, experimental, library and platform) were – like the $MESSAGE directive – added in Delphi 6.

Up to Delphi 5 you didn’t have any means to declare code obsolete. You had to find clever ways around it.

Warnings for hinting directives

When you refer to identifiers marked with a hinting directive, you can get various warning messages that depend on the kind of identifier: a unit or another symbol.

Posted in Apple Pascal, Borland Pascal, DEC Pascal, Delphi, Delphi 2005, Delphi 2006, Delphi 2007, Delphi 2009, Delphi 2010, Delphi 6, Delphi 7, Delphi 8, Delphi XE, Delphi XE2, Delphi XE3, Delphi XE4, Development, Encoding, FreePascal, ISO-8859, ISO8859, Java, Lazarus, MQ Message Queueing/Queuing, QC, Reflection, Software Development, Sybase, Unicode, UTF-8, UTF8 | 2 Comments »

Windows key character that displays on non-Windows systems (like Mac)

Posted by jpluimers on 2014/08/08

Though there is a Unicode character for the Apple Command Key, there is none for the Windows Key.

The Windows font Wingdings does have a character for it at position 255, but that font is usually not installed on non-Windows systems. There it will look like Unicode Character ‘LATIN SMALL LETTER Y WITH DIAERESIS’ (U+00FF).

The Unicode code point that comes closest to the Windows key is Unicode Character ‘SQUARED PLUS’ (U+229E), which is used by the Windows key page on Wikipedia.

  • The Wingdings character looks like this: ÿ
    (on non-Windows systems, it will look like a y with two dots on it: ÿ)
  • The Unicode code point U+229E looks like this: ⊞
    Not a complete match, but pretty close.
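
The two candidates can be checked from a script. This is a minimal Python sketch using only the standard library; the variable names are made up for illustration, and it assumes that text set in Wingdings stores the key as the single byte 255:

```python
import unicodedata

# Byte 255 is what Wingdings-formatted text stores for the Windows logo.
# Without that font, the byte shows up as the ordinary character it maps to
# in Windows-1252 (and ISO-8859-1): U+00FF.
wingdings_byte = bytes([255])
fallback_char = wingdings_byte.decode("cp1252")

# The closest real Unicode code point mentioned above.
squared_plus = "\u229e"

for ch in (fallback_char, squared_plus):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} {ch}")
```

The loop prints the code point, the official Unicode name and the character itself, which makes it easy to see that only U+229E is independent of the installed fonts.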

The Unicode code points for Mac modifier keys are these:

–jeroen

Posted in Development, Encoding, Mac, Mac OS X / OS X / MacOS, Mac OS X 10.4 Tiger, Mac OS X 10.5 Leopard, Mac OS X 10.6 Snow Leopard, Mac OS X 10.7 Lion, MacBook Retina, MacBook-Air, MacBook-Pro, OS X 10.8 Mountain Lion, Power User, Software Development, Unicode, Windows, Windows 7, Windows 8, Windows Server 2003, Windows Server 2003 R2, Windows Vista, Windows XP, Windows-1252 | Leave a Comment »

Recommended reads when dealing with Character Encodings in software

Posted by jpluimers on 2014/05/06

Apart from the mandatory Joel on Software article about Unicode and Character sets, these two articles are of great value too:

Fun to read from that blog is the Historical Technology section, including this article:

–jeroen

PS: The mandatory one is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – Joel on Software.

 

Posted in .NET, Ansi, ASCII, CP437/OEM 437/PC-8, Delphi, Development, EBCDIC, Encoding, ISO-8859, ISO8859, Shift JIS, Software Development, Unicode, UTF-8, UTF8, Windows-1252 | Leave a Comment »

Fiddler2 to the max: inserting proxy authentication to use DropBox (or other app) behind a corporate firewall

Posted by jpluimers on 2014/04/16


A while ago, I was working with a not so cooperative corporate firewall. All web browsers would work fine, but most other applications would not go through the proxy in a nice way.

For instance, Dropbox would show the dreadful “Connection Error” dialog shown on the right.

That dialog basically means “Dropbox has no clue what happened; try fiddling with your proxy or account settings, then press Reconnect Now to retry”.

Many other applications had issues too (for instance, Visual Studio connecting to Team Foundation Server was very unreliable and the workarounds were clumsy).

CNTLM: not the solution

I got inspired by the [WayBack] I code and code: Tutorial: How to use Dropbox behind a corporate proxy server using CNTLM, even though I was pretty sure the corporate firewall was not NTLM based.

And indeed, CNTLM -v -M http://google.com -c CNTLM.INI would give errors like this:

cntlm: Proxy returning invalid challenge!
headers_send: fd 4 warning -999 (connection closed)
Connection closed

HTTP Fiddler: looks promising

So I fired up my old buddy [WayBack] Fiddler 2 HTTP debugging proxy.

Further on, you will learn that Fiddler2 is much more, but for now it is enough to know that it basically sits as a local proxy between your applications and the outside world.

Posted in .NET, .NET 2.0, .NET 3.0, .NET 3.5, .NET 4.0, .NET 4.5, base64, Cntlm, Development, DropBox, Encoding, Fiddler, JavaScript/ECMAScript, NTLM, Power User, Scripting, SocialMedia, Software Development, Web Development, Windows, Windows 7, Windows 8, Windows Server 2000, Windows Server 2003, Windows Server 2003 R2, Windows Server 2008, Windows Server 2008 R2, Windows Vista, Windows XP, Windows-Http-Proxy | Leave a Comment »

Cool post from Marc’s Blog: Delphi XE2’s hidden hints and warnings options

Posted by jpluimers on 2014/04/05

A while ago, I had to disable a couple of warnings from legacy code so I could first perform the Unicode conversion, then make time to eliminate the actual warning cause.

This post was very helpful here:

Marc’s Blog: Delphi XE2’s hidden hints and warnings options.

He lists all the W#### and X#### warnings he could find in Delphi XE2 (XE3, XE4 and XE5 have more or less the same ones), including the mapping to the equivalent directive IDs used inside these blocks:

{$WARN SYMBOL_DEPRECATED ON}
{$WARN SYMBOL_DEPRECATED OFF}
{$WARN SYMBOL_DEPRECATED DEFAULT}
{$WARN SYMBOL_DEPRECATED ERROR}

I also learned that the DEFAULT value restores an option to what you specified in the project settings.

–jeroen

Posted in Delphi, Delphi 2009, Delphi 2010, Delphi XE, Delphi XE2, Delphi XE3, Delphi XE4, Development, Encoding, Software Development, Unicode | 11 Comments »

Charset Detector :: Summary

Posted by jpluimers on 2014/03/31

In case I ever need it: [Wayback] Charset Detector :: Summary.

It is empirical (you cannot find out with 100% reliability what character set / encoding a file uses), but it has a good score.
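
Why detection can only be empirical is easy to show: the same bytes are often valid in several encodings. This is a minimal Python sketch with an arbitrarily chosen sample string and list of candidate encodings; it does not use the Charset Detector library itself:

```python
# Sample text encoded as UTF-8: b'caf\xc3\xa9'
sample = "café".encode("utf-8")

# The same bytes decode without error under several encodings,
# so a detector can only rank the candidates, not prove one of them.
candidates = ["utf-8", "cp1252", "iso-8859-1", "cp437"]
decoded = {}
for name in candidates:
    try:
        decoded[name] = sample.decode(name)
    except UnicodeDecodeError:
        decoded[name] = None  # the bytes are not valid in this encoding

for name, text in decoded.items():
    print(f"{name:12} {text!r}")
```

All four decodes succeed, but only the UTF-8 result reads as the intended word. A detector has to score results like these on how plausible the text looks.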

A similar problem is detecting the language. There too you can get a good score.

–jeroen

via:

Posted in .NET, C#, Delphi, Delphi 2009, Delphi 2010, Delphi XE, Delphi XE2, Delphi XE3, Delphi XE4, Delphi XE5, Development, Encoding, Software Development | Leave a Comment »

HTTP protocol requires you to escape spaces (usually with %20 or with +), but web-browsers will do that for you

Posted by jpluimers on 2014/02/20

Ever since spaces were allowed in path and file names, they have caused confusion.

I personally like the readability of spaces, but still tend to avoid them as they usually cause more harm than the readability gains.

An interesting thread about spaces in file names is operating systems – What technical reasons exist for not using space characters in file names? – Super User.

In URLs, there are various places where spaces can occur. You have to escape them, as Xah Lee wonders about in does HTTP protocol require space be encoded in file path?.

The escaping is part of URL encoding, but the escape depends on the position of the space. In the query part (after the first ?), a space can be escaped as either %20 or a plus sign, but in the path part (before the first ?), it can only be escaped as %20.
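
The difference between the path and the query part can be sketched with Python's standard `urllib.parse` module. The host `example.com`, the file name and the query value are made up for illustration:

```python
from urllib.parse import quote, quote_plus, urlencode

path_segment = "my file.txt"   # hypothetical file name
query_value = "foo bar"        # hypothetical search term

# Path part: a space has to become %20.
encoded_path = quote(path_segment)

# Query part: form-style encoding uses '+', but %20 is accepted as well.
encoded_query_plus = quote_plus(query_value)
encoded_query_pct = quote(query_value)

# urlencode builds a query string and uses the '+' form for spaces.
url = f"https://example.com/{encoded_path}?{urlencode({'q': query_value})}"
print(url)
```

Using `quote` for path segments and `urlencode` (or `quote_plus`) for query values keeps each escape in the part of the URL where it is valid.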

This is explained by bobince in urlencode – when to encode space to plus (+) and when to %20? – Stack Overflow.

That escaping basically makes path and file names a lot less readable when passed as a URL. It causes posts like these:

But why can you still use spaces when you type a URL in your web browser, or use it in a href, src or other HTML URL attribute?

Xah Lee rightly wondered about that earlier in webserver – space in url; did browser got smarter or server? – Stack Overflow.

Technically, neither is allowed. But web browser manufacturers understand that we humans are lazy, and accommodate that by encoding the spaces when putting them into the HTTP request.

You can type “https://www.google.com/search?q=foo bar” in your web browser, and depending on the browser, it gets translated into either one of these:

Recap:

  • encode spaces in URLs as %20
  • try to avoid spaces in path and filenames

–jeroen

via:

Posted in Development, Encoding, HTML, Software Development, URL Encoding, Web Development | Leave a Comment »

Some Unicode links

Posted by jpluimers on 2013/12/30

I see a lot of programmers struggle with Unicode and think it is difficult, as getting the encoding and decoding hassle right can take quite a bit of effort.

There is a lot of fun in using Unicode as well, as the number of code points (in layman's terms: characters) is huge and the Unicode code points are well organized into various planes (or blocks) with related code points.

I like Charbase: A visual unicode database a lot, especially as they have pictograms of all code points that always show a picture, even if you don’t have a font that your browser can use to display the character belonging to the code point.

Here are a few links to characters and blocks of characters in their database that I like a lot:

Posted in Development, Encoding, Software Development, Unicode, UTF-8, UTF8 | Leave a Comment »

▶ Characters, Symbols and the Unicode Miracle – Computerphile – YouTube

Posted by jpluimers on 2013/12/23

Brilliant:

UTF-8 explained in less than 9 minutes.

The diagram almost fits on the back of a napkin, so he explained it with a big marker on classic 132-column fan-fold green-bar continuous form paper (we used to call it zebra-paper).

I’m definitely going to follow the Computerphile videos and watch more of them.

Definitely a great addition to my UTF-8 posting category.

Thanks to everyone who pointed me to this video!

–jeroen 

via: ▶ Characters, Symbols and the Unicode Miracle – Computerphile – YouTube.

Posted in Development, Encoding, Software Development, Unicode, UTF-8, UTF8 | Leave a Comment »

Delphi XE3/XE4: removing empty .VLB files; XE5 update 2 and special offers are out. #codingindelphi

Posted by jpluimers on 2013/12/19

Even when not using Visual Live Binding, Delphi generates empty .VLB files in both Delphi XE3 (virtually always) and Delphi XE4 (most of the time).

Visual Live Binding is one way of binding data to UI in FireMonkey and can also be used in VCL, but does not have to be (Alister Christie made a nice video about it: ▶ Delphi Training Tutorial #77 – Visual Live Bindings – YouTube).

Empty VLB files, and a batch file to delete them

The “empty” VLB files are almost empty: they are exactly 3 bytes long and contain the byte sequence EF BB BF, which is the Unicode BOM (byte order mark) for the UTF-8 encoding.
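
The check for such files can be sketched as follows. This is not the batch file from the post but a hypothetical Python equivalent; the function names and the `dry_run` parameter are made up for illustration:

```python
import codecs
from pathlib import Path


def find_bom_only_vlb(root):
    """Return .vlb files under root whose entire content is the UTF-8 BOM."""
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() != ".vlb":
            continue
        # Check the size first so larger files are never read into memory.
        if path.stat().st_size != len(codecs.BOM_UTF8):
            continue
        if path.read_bytes() == codecs.BOM_UTF8:  # EF BB BF
            matches.append(path)
    return matches


def delete_bom_only_vlb(root, dry_run=True):
    """Delete the BOM-only .vlb files; with dry_run=True only report them."""
    found = find_bom_only_vlb(root)
    for path in found:
        if not dry_run:
            path.unlink()
    return found
```

Calling `delete_bom_only_vlb(".")` only lists the candidates. Passing `dry_run=False` actually removes them, so it is worth reviewing the list first.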

Posted in Delphi, Delphi XE3, Delphi XE4, Delphi XE5, Development, Encoding, QC, Software Development, Unicode, UTF-8, UTF8 | Leave a Comment »