The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘internatiolanization (i18n) and localization (l10)’ Category

How can I get the default code page for a locale? – The Old New Thing

Posted by jpluimers on 2017/06/20

Ask GetLocaleInfo (example function GetAnsiCodePageForLocale included): [WayBack] How can I get the default code page for a locale? – The Old New Thing

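UINT GetAnsiCodePageForLocale(LCID lcid)
{
  UINT acp = 0; // 0 means "no ANSI code page" (also returned on failure)
  // With LOCALE_RETURN_NUMBER the buffer receives a number instead of
  // a string, but its size is still expressed in TCHARs.
  int sizeInChars = sizeof(acp) / sizeof(TCHAR);
  if (GetLocaleInfo(lcid,
                    LOCALE_IDEFAULTANSICODEPAGE |
                    LOCALE_RETURN_NUMBER,
                    reinterpret_cast<LPTSTR>(&acp),
                    sizeInChars) != sizeInChars) {
    // Oops - something went wrong
    acp = 0;
  }
  return acp;
}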

And even though you didn’t ask, you can use LOCALE_IDEFAULTCODEPAGE to get the OEM code page for a locale.

Bonus gotcha: There are a number of locales that are Unicode-only. If you ask the GetLocaleInfo function for their ANSI and OEM code pages, the answer is “Um, I don’t have one.” (You get zero back.)
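A minimal sketch of one way to cope with that zero: treat it as "no legacy code page" and fall back to UTF-8. The helper name CodePageOrUtf8 and the fallback choice are my assumptions, not part of the original article; 65001 is the Windows code page identifier for UTF-8 (the value of CP_UTF8).

```cpp
#include <cassert>
#include <cstdint>

// Windows code page identifier for UTF-8 (the value of CP_UTF8).
constexpr std::uint32_t kCodePageUtf8 = 65001;

// Hypothetical helper: takes the value that GetAnsiCodePageForLocale
// returned and substitutes UTF-8 when the locale reports 0, which is
// what Unicode-only locales do.
constexpr std::uint32_t CodePageOrUtf8(std::uint32_t localeCodePage)
{
  return localeCodePage != 0 ? localeCodePage : kCodePageUtf8;
}

// Compile-time sanity check: a Unicode-only locale ends up on UTF-8.
static_assert(CodePageOrUtf8(0) == kCodePageUtf8, "zero falls back to UTF-8");
```

Whether UTF-8 is the right fallback depends on what consumes the code page; the point is only that zero must be handled explicitly.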


–jeroen

Posted in Development, Encoding, internatiolanization (i18n) and localization (l10), Software Development, The Old New Thing, Windows Development, Windows-1252 | 2 Comments »

Where can I get the glossary of Microsoft’s standard translations for computer terms? – The Old New Thing

Posted by jpluimers on 2016/12/22

A while ago I bumped into [WayBack] Where can I get the glossary of Microsoft’s standard translations for computer terms? – The Old New Thing

Since I’m a non-digital pack-rat as well, I love [WayBack] this comment by [WayBack] Ian Boyd:

We have an *old* copy of the Microsoft Style Guide – an actual book. From that book i’ll always remember that e-mail has a hyphen in it.

I have that book too and write e-mail the same way.

But books are often hard to search through, so I love this list that [WayBack] Raymond Chen made:

I especially like the [WayBack] interactive search, but whatever the outcome, please remember that the context of your translation is very important.

For instance, I vividly remember a project some 20+ years ago where we had to translate the words “Close” and “Cancel” in the realm of the insurance business.

All guides indicated that “Close” should become “Sluiten”, which in that realm is colloquial for “Afsluiten”, meaning “to take out an insurance”: a totally wrong action. Similarly, “Cancel” translated to “Annuleren”, which in the same realm would mean “to cancel an insurance”.

So we went for very specific translations narrowing down what exactly would happen in those screens, like:

  • “Verlaten” (English “Exit”)
  • “Bewaren” or “Opslaan” (English “Save”)
  • “Terug” (English “Back”)

–jeroen

Posted in Development, internatiolanization (i18n) and localization (l10), Software Development, The Old New Thing, Windows Development | Leave a Comment »

Translation Memory Tools Tried and Found Wanting – Oli’s Blog

Posted by jpluimers on 2016/02/11

Thanks Oliver for summarising this: Translation Memory Tools Tried and Found Wanting – Oli’s Blog

His conclusion supports what I see at all my clients, who build their own translation tooling: no 3rd party tool really supports the full process well, especially not the translation memory parts.

–jeroen

Posted in Development, internatiolanization (i18n) and localization (l10), Software Development | Leave a Comment »

Jon Skeet’s speech “Back to basics” is really a good watch – via Jørn Einar Angeltveit G+

Posted by jpluimers on 2015/07/15

Thanks [Wayback] Jørn Einar Angeltveit for sharing this a while ago:

A session by Jon Skeet and Tony the Pony (which has strong teeth) presented during the Polish DevDay 2013 in Kraków, Poland.

[Wayback] +Jon Skeet’s speech [Wayback] “Back to basics” is really a good watch.

In a funny way, he explains why the simplest fundamentals of computer software (text, dates and numbers) can cause some real headaches for the programmer…

In case you didn’t know: Jon Skeet is “Chuck Norris” on [Wayback] stackoverflow.com:

The subtitle is “the mess we’ve made of our fundamental data types”.

Some of the topics covered:

Read the rest of this entry »

Posted in .NET, C#, Conference Topics, Conferences, Delphi, Development, Encoding, Event, internatiolanization (i18n) and localization (l10), Java, Java Platform, Jon Skeet, Pascal, Scripting, Software Development, Unicode | 2 Comments »

Michael Kaplan’s Sorting it All Out blog is back! http://www.siao2.com (via Tim’s comment)

Posted by jpluimers on 2014/08/14

A while ago, Tim mentioned that [WayBack] Michael Kaplan’s blog “Sorting it All Out” on MSDN was gone.

I amended my original post because of it (see below), and I’m really happy that Tim kept track of his comment, and just posted a new comment:

Michael Kaplan’s Sorting it All Out blog is back! [WayBack] http://www.siao2.com

Back to the original edit I made, as the new blog doesn’t (yet?) have all the content of the old blog:

Edit: Michael’s MSDN blog is officially dead, but there are the nice web archive and web cache virtues:

Michael also appeared on this 30 minute podcast episode: [WayBack] Hanselminutes Technology Podcast – Fresh Air and Fresh Perspectives for Developers – Sorting out Internationalization with Michael Kaplan

Michael Kaplan is a Developer in the Windows International group and the author of the popular ‘Sorting It Out’ blog that is dedicated to all things ‘-ization.’ That means Globalization, Internationalization, and Localization. This show is brought to you by the CYRILLIC CAPITAL LETTER A.

Some key points:

  • Use these languages for UI testing
    • English as it is common and slightly wordy
    • German because it is more wordy (30-50% more than English), which tests for clipped text, and it is used enough to warrant the energy
    • Turkish because of the Turkish i
    • Arabic (which is right-to-left, cursive and has ligatures) or Hebrew (which is just right-to-left)
    • Thai because it has plenty of word-breaking issues and tests Uniscribe well
  • Push UTF-8 all the way through your system and back and avoid question marks and other
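
To make the Turkish i point concrete, here is a minimal sketch of why that language catches case-mapping bugs. Turkish has a dotted and a dotless i, each with its own capital, so an English-centric "i becomes I" rule gives the wrong letter. The function names are made up for this illustration; real code should rely on a Unicode-aware library instead of hand-written rules like these.

```cpp
#include <cassert>

// Uppercases one code point using ASCII rules only, which is what a naive,
// English-centric implementation does.
constexpr char32_t ToUpperAscii(char32_t c)
{
  return (c >= U'a' && c <= U'z')
           ? static_cast<char32_t>(c - (U'a' - U'A'))
           : c;
}

// Hypothetical helper applying the Turkish rules for the letter i:
//   dotted  i (U+0069) -> capital I with dot above (U+0130)
//   dotless i (U+0131) -> plain capital I          (U+0049)
// Everything else falls through to the ASCII rule.
constexpr char32_t ToUpperTurkish(char32_t c)
{
  if (c == U'i') return U'\u0130';
  if (c == U'\u0131') return U'I';
  return ToUpperAscii(c);
}

// The two rules disagree on the most innocent-looking letter.
static_assert(ToUpperAscii(U'i') != ToUpperTurkish(U'i'), "Turkish i differs");
```

Any case-insensitive comparison of identifiers or file names that quietly follows the user's locale will therefore behave differently on a Turkish system, which is exactly what a Turkish test pass brings out.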

After that: time to catch up on Michael’s new blog (:

–jeroen

via: Delphi: a few short notes on LoadString and loading shell resource strings for specific LCIDs

Posted in Development, internatiolanization (i18n) and localization (l10), Software Development, User Experience (ux) | Leave a Comment »

Delphi: a few short notes on LoadString and loading shell resource strings for specific LCIDs

Posted by jpluimers on 2014/07/17

I’m not a real expert on LCIDs (values like 1033 (aka 0x409 or $409) and 1043 (aka 0x413 or $413)), but here are a few notes on stuff that I wrote a while ago to obtain shell32.dll resource strings for various LCIDs.
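
For readers who wonder what is inside such a number, a small sketch: the low 16 bits of an LCID are the language ID, which in turn splits into a 10-bit primary language and a 6-bit sublanguage. The functions below restate the Windows macros LANGIDFROMLCID, PRIMARYLANGID and SUBLANGID under different names so the example compiles without windows.h; the names are mine.

```cpp
#include <cassert>
#include <cstdint>

// Low 16 bits of an LCID: the language ID (what LANGIDFROMLCID returns).
constexpr std::uint16_t LangIdFromLcid(std::uint32_t lcid)
{
  return static_cast<std::uint16_t>(lcid & 0xFFFF);
}

// Low 10 bits of a language ID: the primary language (PRIMARYLANGID).
constexpr std::uint16_t PrimaryLangId(std::uint16_t langId)
{
  return static_cast<std::uint16_t>(langId & 0x3FF);
}

// High 6 bits of a language ID: the sublanguage (SUBLANGID).
constexpr std::uint16_t SubLangId(std::uint16_t langId)
{
  return static_cast<std::uint16_t>(langId >> 10);
}

// 1033 = 0x409: primary language 0x09 (English), sublanguage 1.
static_assert(PrimaryLangId(LangIdFromLcid(1033)) == 0x09, "English");
// 1043 = 0x413: primary language 0x13 (Dutch), sublanguage 1.
static_assert(PrimaryLangId(LangIdFromLcid(1043)) == 0x13, "Dutch");
```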

The most often used way to load resource strings is by calling the LoadString Windows API call which loads the string for the currently defined LCID.

To get them for a different LCID, there are two ways:

  1. Set the LCID for the current thread (don’t forget to notify the Delphi RTL you did this, and update FormatSettings)
  2. Write an alternative for LoadString that gets a string for a specific LCID (so you can keep the current thread in a different LCID)

The first method – altering the LCID of the current thread – is done using SetThreadLocale in Windows XP or earlier, and SetThreadUILanguage in Windows Vista/2008 and up (I’m not sure about the timeline of Windows Server versions, but I guess the split is between 2003 and 2008), as mentioned at SetThreadLocale and SetThreadUILanguage for Localization on Windows XP and Vista | words.

SetThreadLocale is deprecated, as Windows has started switching from LCID to Locale Names. This can cause odd behaviour in at least Delphi versions 2010, XE and XE2. See the answers at delphi – GetThreadLocale returns different value than GetUserDefaultLCID? for more information.

But even on XP it has the potential drawback of selecting language ID 0 (LANG_NEUTRAL), which selects the English language if it is available (as that is in the default search order). This behaviour is described in Writing Win32 Multilingual User Interface Applications, in the answers to LoadString works only if I don’t have an English string table and Windows skipping language-specific resources, and in the Embarcadero Discussion Forums thread How to load specific locale settings.

To work around that, you can do two things: store your resource strings in locale-dependent DLLs, or (if you don’t write those DLLs yourself) write an alternative for LoadString.

I’ve done the latter for Delphi, so I could load strings for a specific LCID from the Shell32.dll.
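
The core of such a LoadString alternative is small. The sketch below is C++, not the Delphi code from the changeset, and it only covers walking the resource data. String resources are stored in bundles of 16 strings, each string being a 16-bit length followed by that many UTF-16 code units. On Windows you would typically locate the bundle with FindResourceEx (passing RT_STRING, the bundle ID and the language ID) followed by LoadResource and LockResource. The names StringBundleId and StringFromBundle are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The bundle holding string `id` has resource ID (id / 16) + 1.
constexpr unsigned StringBundleId(unsigned id)
{
  return id / 16 + 1;
}

// Walks one bundle of `bundleLen` UTF-16 code units and returns the string
// for `id`. Each of the 16 entries is a length prefix followed by that many
// code units, without a terminator. Returns an empty string when the entry
// is empty or the data is truncated.
std::u16string StringFromBundle(const char16_t* bundle,
                                std::size_t bundleLen,
                                unsigned id)
{
  std::size_t pos = 0;
  for (unsigned i = 0; i < (id & 15); ++i) { // skip the preceding entries
    if (pos >= bundleLen) return {};
    pos += 1 + static_cast<std::size_t>(bundle[pos]);
  }
  if (pos >= bundleLen) return {};
  const std::size_t len = bundle[pos];
  if (pos + 1 + len > bundleLen) return {};
  return std::u16string(bundle + pos + 1, len);
}
```

Because the language ID goes into the resource lookup itself, the current thread can stay in whatever locale it already had.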

For a full overview of all these strings, see http://www.angelfire.com/space/ceedee/shell32stringtables.txt

A few pieces of code.

You can get the full code at the BeSharp – Source Code Changeset 100520 (now at bitbucket too). Read the rest of this entry »

Posted in Delphi, Delphi XE2, Delphi XE3, Delphi XE4, Development, internatiolanization (i18n) and localization (l10), Software Development | 4 Comments »

Link clearance: fonts, localization, languages, internationalization, PostScript, and more

Posted by jpluimers on 2013/03/01

A few links I came across recently:

–jeroen

Posted in About, Development, Encoding, EPS/PostScript, Font, internatiolanization (i18n) and localization (l10), Personal, Power User, Programmers Font, Software Development, Unicode | Leave a Comment »

Excel XML Spreadsheet: Styles and formatting

Posted by jpluimers on 2011/08/25

I found some time to continue my series that started with Excel XML Spreadsheet: Date.Type is mandatory :)

This time, it is about Styles and using the styles to format. I’ll limit myself to formatting Columns, but you can equally apply this to individual Cells, Rows, and Tables.

Note that in the XML listings below, I have replaced the angle brackets with { and }, because the WordPress editor will otherwise delete the XML from the source code portions.

First, let’s look at some ss:Styles:

 {Styles}
  {Style ss:ID="Default" ss:Name="Normal"}
   {Alignment ss:Vertical="Bottom"/}
  {/Style}
  {Style ss:ID="s21"}
   {NumberFormat ss:Format="yyyy/mm/dd"/}
  {/Style}
  {Style ss:ID="s22"}
   {NumberFormat ss:Format="yyyy/mm/dd\ h:mm:ss"/}
  {/Style}
  {Style ss:ID="s31"}
   {NumberFormat ss:Format="[ENG][$-409]ddd"/}
  {/Style}
  {Style ss:ID="s32"}
   {NumberFormat ss:Format="[$-F800]dddd\,\ mmmm\ dd\,\ yyyy"/}
  {/Style}
 {/Styles}

Then the usage of the styles in Columns:

   {Column ss:StyleID="s21" ss:Width="53.25"/}
   {Column ss:Index="4" ss:StyleID="s31" ss:Width="89.25"/}
   {Column ss:StyleID="s22" ss:Width="95.25"/}
   {Column ss:StyleID="s32" ss:Width="95.25"/}

First a few remarks about the ss:Styles:

  1. Styles have IDs, which don’t need to be in the form s##; you can use any unique ID for them. Excel uses s## because that’s how the formatting pick-list works.
  2. You specify the formatting as an ss:NumberFormat using the components from the Creating international number formats documentation.
  3. You can add an Excel-specific LCID (locale identifier) to a format. Without it, it will use the user’s locale settings.
  4. You can omit the language hint (like [ENG]) from the formatting.
  5. The Excel LCID is very similar to the LCID Structure using hexadecimal values from the Microsoft Locale ID Values (which replaced the old, now defunct Locale ID Chart), the Language Identifier Constants and Strings table or the list of Locale IDs Assigned by Microsoft, but with a few twists.
  6. There is a lot of confusion about [$-F800] and [$-F400], which actually behave as LANG_SYSTEM_DEFAULT (0x0800 in the latter table), where [$-F800] displays the long date and [$-F400] displays the time (as correctly identified in this Openoffice Bugzilla bug report – or the Google cache of it).
  7. A three-digit language value like [$-409] should be read as the 4-digit LCID 0x0409. It will format the cell using that specific language (in this case: English 3-letter weekday abbreviation).
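
If you generate these files from code, the locale tag is easy to build. A small sketch, assuming you already have the LCID as a number; the helper name WithExcelLocale is hypothetical, and it only handles plain LCID tags like the [$-409] above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: prefixes an Excel number format with a locale tag,
// writing the LCID in uppercase hexadecimal without leading zeros, as in
// the styles above.
std::string WithExcelLocale(unsigned lcid, const std::string& format)
{
  char tag[16]; // "[$-" + up to 8 hex digits + "]" + terminator fits
  std::snprintf(tag, sizeof(tag), "[$-%X]", lcid);
  return tag + format;
}
```

For example, WithExcelLocale(0x413, "dddd") gives a format that shows the full weekday name in Dutch, regardless of the user's own locale settings.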

A few remarks about the ss:Columns:

  1. These columns define formatting for columns A, D, E and F.
  2. You don’t need to have a definition for every column in your Worksheet.Table, just for the ones that need formatting.
  3. The Column definition is smart: it can be sparse! After each gap, define a column having an ss:Index attribute, then continue defining subsequent columns until you need another gap.
  4. You can omit the ss:Width attribute: when empty, the column will auto-size.
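
The sparse numbering is easiest to see in code. This sketch resolves a list of Column elements to their actual positions: an explicit ss:Index wins, otherwise a column simply follows the previous one. The struct and function names are made up for illustration; 0 stands for "no ss:Index attribute".

```cpp
#include <cassert>
#include <string>
#include <vector>

// One Column element; index == 0 means the ss:Index attribute is absent.
struct ColumnDef {
  unsigned index;
  std::string styleId;
};

// Resolves each Column to its 1-based position in the worksheet.
std::vector<unsigned> ResolveColumnPositions(const std::vector<ColumnDef>& columns)
{
  std::vector<unsigned> positions;
  unsigned next = 1;
  for (const ColumnDef& column : columns) {
    const unsigned position = column.index != 0 ? column.index : next;
    positions.push_back(position);
    next = position + 1;
  }
  return positions;
}

// Converts a 1-based position to Excel column letters (1 -> "A", 27 -> "AA").
std::string ColumnLetters(unsigned position)
{
  std::string letters;
  while (position > 0) {
    --position;
    letters.insert(letters.begin(), static_cast<char>('A' + position % 26));
    position /= 26;
  }
  return letters;
}
```

Fed with the four Column elements from the listing above (no index, index 4, no index, no index), this yields positions 1, 4, 5 and 6, which are columns A, D, E and F.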

–jeroen

PS: Rob van Gelder posted a nice formula to show nice translations using Excel formatting.

Posted in Development, internatiolanization (i18n) and localization (l10), Software Development, XML, XML/XSD, XSD | Leave a Comment »