Archive for the ‘Unicode’ Category
Posted by jpluimers on 2013/03/29
Als je postcode “1060 NP” invult bij aansluiting en “1170 AB” bij facturatie, dan krijg je deze onterechte foutmeldingen:
- Organisatie postcode is ongeldig
- Facturatie gegevens postcode is ongeldig
Beetje vreemd, want al sinds de introductie van postcodes in Nederland in 1978 zit in een postcode 1 spatie, en tussen de postcode en de woonplaats 2 spaties.
Ook trema‘s gaan mis: bij postcode 1060 NP hoort de straat Pyreneeën in Amsterdam, maar bij Heldenvan.nu wordt het deze Mojibake:
Read the rest of this entry »
Posted in Development, Encoding, Mojibake, Opinions, Power User, Software Development, Unicode | Leave a Comment »
Posted by jpluimers on 2013/03/14
Variant Records are a feature that has been in the Pascal language since Standard Pascal.
A cool page for historic perspective is R3R: Pascal Features in Popular Compilers, hopefully someone will update it to more modern versions of the mentioned compilers.
There is not much official documentation on the Delphi side on this, so below some parts of a case I used for a project that started in 1997 and is still in use to day. Read the rest of this entry »
Posted in APPC, AS/400 / iSeries / System i, ASCII, COBOL, Communications Development, Conference Topics, Conferences, CPI-C, Delphi, Delphi 1, Delphi 2005, Delphi 2006, Delphi 2007, Delphi 2009, Delphi 2010, Delphi 3, Delphi 4, Delphi 5, Delphi 6, Delphi 7, Delphi 8, Delphi XE, Delphi XE2, Delphi XE3, Development, Encoding, Event, HIS Host Integration Services, Internet protocol suite, MQ Message Queueing/Queuing, SNA, Software Development, TCP, Unicode, UTF-8, WebSphere MQ | 9 Comments »
Posted by jpluimers on 2013/03/12
Few people know about a Delphi language feature that has been present since Delphi 1: prepending the type definition with a type keyword to make the type getting a new identity.
Each time I use it, I have to do some browsing for the consequences, and this time I wrote down some notes and created a small example program (source is also below).
This time I needed it when writing class wrappers on top of the Delphi bindings for WebSphere MQ.
WebSphere MQ has Queues where you can put and get messages. It also has Queue Managers to which you connect, and that provide queuing services and manages queues.
Both Queues and Queue Managers have names that can be up to 48 (single byte) characters long.
Those names mean totally different things, so though the have similar data types, they have a different identity.
The same holds for 20 byte character arrays (they can be used as names for ChannelName, ShortConnectionName and MCAName). The 264 byte character array is so far used for ConnectionName only.
Distinguishing those types: That’s what “type types” in Delphi are all about. Read the rest of this entry »
Posted in CP437/OEM 437/PC-8, Delphi, Delphi 1, Delphi 2005, Delphi 2006, Delphi 2007, Delphi 2009, Delphi 2010, Delphi 3, Delphi 4, Delphi 5, Delphi 6, Delphi 7, Delphi 8, Delphi x64, Delphi XE, Delphi XE2, Delphi XE3, Development, Encoding, Shift JIS, Software Development, Unicode, UTF-8, UTF8 | 1 Comment »
Posted by jpluimers on 2013/03/01
A few links I came across recently:
–jeroen
Posted in About, Development, Encoding, EPS/PostScript, Font, internatiolanization (i18n) and localization (l10), Personal, Power User, Programmers Font, Software Development, Unicode | Leave a Comment »
Posted by jpluimers on 2013/01/29
A while ago, I had to adapt a DOS app that used one specific version of Excel to do some batch processing so it would support multiple versions of Excel on multiple versions of Windows.
One of the big drawbacks of DOS applications is that the command lines you can use are even shorter than Windows applications, which depending you how you call an application are:
This is how the DOS app written in Clipper (those were the days, it was even linked with Blinker :) started Excel:
c:\progra~1\micros~2\office11\excel.exe parameters
01234567890123456789012345678901234567890
1 2 3 4
The above depends on 8.3 short file names that in turn depend on the order in which similar named files and directories have been created.
The trick around this, and around different locations/versions of an application, is to use START to find the right version of Excel.
The reason it works is because in addition to PATH, it checks the App Paths portions in the registry in this order to find an executable: Read the rest of this entry »
Posted in Batch-Files, Development, Encoding, Power User, Scripting, Software Development, Unicode, Windows, Windows 7, Windows 8, Windows Server 2000, Windows Server 2003, Windows Server 2003 R2, Windows Server 2008, Windows Server 2008 R2, Windows Vista, Windows XP | Leave a Comment »
Posted by jpluimers on 2013/01/08
When developing in multiple languages, it sometimes is funny to see how they differ in compiler oddities.
Below are a few on const examples.
Basically, in C# you cannot go from a char const to a string const, and chars are a special kind of int.
In Delphi you cannot go from a string to a char. Read the rest of this entry »
Posted in .NET, ASCII, C#, C# 1.0, C# 2.0, C# 3.0, C# 4.0, C# 5.0, Delphi, Delphi 2009, Delphi 2010, Delphi XE, Delphi XE2, Development, Encoding, Software Development, Unicode | Leave a Comment »
Posted by jpluimers on 2012/11/20
A while ago I had a “duh” moment while calling a method that had many overloads, and one of the overloads was using int, not the char I’d expect.
The result was that a default value for that char was used, and my parameter was interpreted as a (very small) buffer size. I only found out something went wrong when writing unit tests around my code.
The culprit is this C# char feature (other implicit type conversions nicely summarized by Muhammad Javed):
A char can be implicitly converted to ushort, int, uint, long, ulong, float, double, or decimal. However, there are no implicit conversions from other types to the char type.
Switching between various development environments, I totally forgot this is the case in languages based on C and Java ancestry. But not in VB and Delphi ancestry (C/C++ do numeric promotions of char to int and Java widens 2-byte char to 4-byte int; Delphi and VB.net don’t).
I’m not the only one who was confused, so Eric Lippert wrote a nice blog post on it in 2009: Why does char convert implicitly to ushort but not vice versa? – Fabulous Adventures In Coding – Site Home – MSDN Blogs.
Basically, it is the C ancestry: a char is an integral type always known to contain an integer value representing a Unicode character. The opposite is not true: an integer type is not always representing a Unicode character.
Lesson learned: if you have a large number of overloads (either writing them or using them) watch for mixing char and int parameters.
Note that overload resolution can be diffucult enough (C# 3 had breaking changes and C# 4 had breaking changes too, and those are only for C#), so don’t make it more difficult than it should be (:
Below a few examples in C# and VB and their IL disassemblies to illustrate their differnces based on asterisk (*) and space ( ) that also show that not all implicits are created equal: Decimal is done at run-time, the rest at compile time.
Note that the order of the methods is alphabetic, but the calls are in order of the type and size of the numeric types (integral types, then floating point types, then decimal).
A few interesting observations:
- The C# compiler implicitly converts char with all calls except for decimal, where an implicit conversion at run time is used:
L_004c: call valuetype [mscorlib]System.Decimal [mscorlib]System.Decimal::op_Implicit(char)
L_0051: call void CharIntCompatibilityCSharp.Program::writeLineDecimal(valuetype [mscorlib]System.Decimal)
- Same for implicit conversion of byte to the other types, though here the C# and VB.NET compilers generate slightly different code for run-time conversion.
C# uses an implicit conversion:
L_00af: ldloc.1
L_00b0: call valuetype [mscorlib]System.Decimal [mscorlib]System.Decimal::op_Implicit(uint8)
L_00b5: call void CharIntCompatibilityCSharp.Program::writeLineDecimal(valuetype [mscorlib]System.Decimal)
VB.NET calls a constructor:
L_006e: ldloc.1
L_006f: newobj instance void [mscorlib]System.Decimal::.ctor(int32)
L_0075: call void CharIntCompatibilityVB.Program::writeLineDecimal(valuetype [mscorlib]System.Decimal)
Here is the example code: Read the rest of this entry »
Posted in .NET, Agile, Algorithms, C#, C# 1.0, C# 2.0, C# 3.0, C# 4.0, C# 5.0, C++, Delphi, Development, Encoding, Floating point handling, Java, Software Development, Unicode, Unit Testing, VB.NET | 1 Comment »
Posted by jpluimers on 2012/07/26
While reviewing some client’s code, I noticed they were generating and parsing XML and HTML by hand (do not ever do that yourself!).
Before refactoring this into something that uses libraries that properly understand XML and HTML, I needed to assess some of the risks.
A major risk is to get the escaping (and unescaping) of XML and HTML right.
Time to finally organize some of my links on escaping HTML and XML that I had in my favourites list.
The starting point is the List of XML and HTML character entity references on Wikipedia. It is readable, complete and lists both kinds of escapes.
XML escapes
The official W3C text that describes XML escaping is hard to read.
There are only 5 predefined XML entities for characters that can (some must) be escaped. This table is derived from the Wikipedia article.
HTML escapes
Read the rest of this entry »
Posted in " quot, & amp, > gt, < lt, ' apos, ASCII, Development, Encoding, HTML, Power User, SocialMedia, Software Development, Unicode, Web Development, WordPress, XML, XML escapes, XML/XSD | 1 Comment »
Posted by jpluimers on 2012/05/31
Brilliant: site where you can Search UTF8 Unicode Characters.
If you know part of the name of a Unicode character, you can now try and find it, then copy/paste it from that site.
Edit: 20160403: that site disappeared, but this one works: Unicode Character Search and Shapecatcher.com: Unicode Character Recognition still works.
–jeroen
Posted in Development, Encoding, Power User, Software Development, Unicode, UTF-8, UTF8 | Leave a Comment »
Posted by jpluimers on 2012/04/24
While transitioning from SQL Server 2000 to 2008, I recently had the “A severe error occurred on the current command. The results, if any, should be discarded.” occurring on SQL Server 2000 in the form as shown at the bottom of this message.
Many of the search results point you into the area of atabase corruption, or in using NVARCAR parameters with SQL Server 2000 or SQL Server 2005 (the app didn’t use NVARCAR, nor did it use large VARCHAR parameters).
The cool thing on the SQL Server Forums – System.Data.SqlClient.SqlException: A severe error occurred on the current command post was that it summed up causes, and asked for more:
Posted – 06/17/2004 : 15:05:20
Rashid writes “Hi: Gurus I am getting these errors when I try to execute my application. According to MS knowledge base (http://support.microsoft.com/default.aspx?scid=kb;en-us;827366) these errors happen due to following resons
- You use a SqlClient class in a Finalize method or in a C# destructor.
- You do not specify an explicit SQLDbType enumeration when you create a SqlParameter object. When you do not specify an explicit SQLDbType, the Microsoft .NET Framework Data Provider for SQL Server (SqlClient) tries to select the correct SQLDbType based on the data that is passed. SqlClient is not successful.
- The size of the parameter that you explicitly specify in the .NET Framework code is more than the maximum size that you can use for the data type in Microsoft SQL Server.
None of these are true in my case. Are there any other reasons that can cause these problems..
There is one more: sending huge SQL Statements to your SQL Server is always a bad idea and gives this error too. Read the rest of this entry »
Posted in .NET, C#, C# 2.0, C# 3.0, C# 4.0, Database Development, Development, Encoding, Software Development, SQL Server, SQL Server 2000, SQL Server 2008 R2, Unicode | Leave a Comment »