The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘Encoding’ Category

Plastic SCM command-line for merge and diff

Posted by jpluimers on 2018/09/03

Just in case I have Plastic SCM without Beyond Compare:

Merge

"C:\Program Files\PlasticSCM5\client\mergetool" -b="%TEMP%\baseFile-guid.pas" -bn="baseSymbolicName" -bh="baseHash" -s="%TEMP%\sourceFile-guid.pas" -sn="srcSymbolicName" -sh="srcHash" -d="...\destinationPath\destinationFile.pas" -dh="destinationHash" -a -r="%TEMP%\resultFile.pas" -t="text" -i="NotIgnore" -e="NONE" -m="forced" -re="NONE" --progress="progressDescription" --extrainfofile="%TEMP%\extraInfoFile.tmp"
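A minimal sketch of how a merge command like the one above could be assembled from Python, for instance to hand to subprocess.run. The helper name build_merge_args and the file names in the test values are hypothetical; only options that appear in the command above are used, and the symbolic names, hashes, progress text and extra info file are left out for brevity.

```python
def build_merge_args(mergetool, base, source, destination, result,
                     file_type="text", merge_type="forced", automatic=True):
    """Return the mergetool argument list for a three-way merge.

    All option names (-b, -s, -d, -a, -r, -t, -m) are taken from the
    command line above; the function itself is only an illustration.
    """
    args = [mergetool, f"-b={base}", f"-s={source}", f"-d={destination}"]
    if automatic:
        # -a: try to resolve the merge automatically first
        args.append("-a")
    args += [f"-r={result}", f"-t={file_type}", f"-m={merge_type}"]
    return args
```

Passing a list like this to subprocess.run avoids having to quote paths with spaces such as C:\Program Files\PlasticSCM5\client\mergetool.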

Diff

To be done

Merge help (takes about 10 seconds to start):

"C:\Program Files\PlasticSCM5\client\mergetool.exe" --help

---------------------------
Mergetool usage
---------------------------
Usage: mergetool [ | ]

    diffOptions:  []

    mergeOptions:   [] [[] [] ] [] []

        baseFile:            {-b | --base}= 
        baseSymbolicName:    {-bn | --basesymbolicname}=
        automatic:           -a | --automatic
        silent:              --silent
        resultFile:          {-r | --result}=
        mergeType:           {-m | --mergeresolutiontype}={onlyone | onlysrc | onlydst | try | forced}

    generalFiles:  []  []

        sourceFile:          {-s | --source}=
        srcSymbolicName:     {-sn | --srcsymbolicname}=
        destinationFile:     {-d | --destination}= 
        dstSymbolicName:     {-dn | --dstsymbolicname}=

    generalOptions: [] [] [] []

        defaultEncoding:     {-e | --encoding}={none |ascii | unicode | bigendian | utf7 | utf8}
        comparisonMethod:    {-i | --ignore}={none | eol | whitespaces | eol&whitespaces}
        fileType:            {-t | --filestype}={text/csharp | text/XML | text}
        resultEncoding:      {-re | --resultencoding}={none |ascii | unicode | bigendian | utf7 | utf8}
        progress:            {--progress}=progress string indicating the current progress, for example: Merging file 1/8
        extraInfoFile:       {--extrainfofile}=path to a file that contains extra info about the merge

    Remarks:
          
        -a | --automatic:    Tries to resolve the merge automatically.
                             If the merge can't be resolved automatically (requires user interaction), the merge tool is shown.
        --silent:            This option must be used combined with the --automatic option.
                             When a merge can't be resolved automatically, this option causes the tool to return immediately
                             with a non-zero exit code (no merge tool is shown).
                             If the tool was able to resolve the merge automatically, the program returns exit code 0.

    Examples:

        mergetool
        mergetool -s=file1.txt -d=file2.txt
        mergetool -s=file1.txt -b=file0.txt --destination=file2.txt
        mergetool --base=file0.txt -d=file2.txt --source=file1.txt --automatic --result=result.txt
        mergetool -b=file0.txt -s=file1.txt -d=file2.txt -a -r=result.txt -e=utf7 -i=eol -t=text/csharp -m=onlyone
---------------------------
OK   
---------------------------

The merge extraInfoFile.tmp has a syntax like this:

Source (cs:-#)
    relative-sourceFile from cs:-# created by userName on timeStamp
    Comments: Source changeset description

Base (cs:#)
    relative-baseFile from cs:#@/baseBranch by userName on timeStamp
    Comments: BO's + CRUDS 

Destination (cs:#)
    relative-destinationFile from cs@/destinationBranch created by userName on timeStamp
    Comments: Destination changeset description

Where each cs is a change set number.

–jeroen

Posted in Beyond Compare, Development, Encoding, PlasticSCM, Power User, Software Development, Source Code Management | Leave a Comment »

Shouldnt this line be null terminated? HostEnt := gethostbyname(MarshaledASt…

Posted by jpluimers on 2018/08/07

[WayBack] Shouldnt this line be null terminated? HostEnt := gethostbyname(MarshaledAString(TEncoding.UTF8.GetBytes(Name))); – G+ – Allen Drennan

Yes it should, but I’m not sure if the compiler is fully to blame as GetBytes does not return a terminating zero byte.
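The same effect can be shown outside Delphi. A minimal Python sketch, assuming the callee expects a C-style string: encoding to UTF-8 yields only the payload bytes, so the terminating zero byte has to be appended explicitly. The helper name to_c_string is made up for this illustration.

```python
def to_c_string(name: str) -> bytes:
    """Encode name as UTF-8 and append the zero byte a C API expects."""
    raw = name.encode("utf-8")  # payload bytes only, no terminator
    return raw + b"\x00"        # explicit terminating zero byte
```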

–jeroen

Posted in Delphi, Development, Encoding, Software Development, UTF-8, UTF8 | Leave a Comment »

Why I like PlantUML

Posted by jpluimers on 2018/06/13

Ever since I started using computers, I’ve liked text based solutions.

It’s one of the reasons I like PlantUML, but there are more. This is from a GitLab.com request I did a while ago: [WayBack/Archive] Please enable PlantUML rendering on gitlab.com both for standalone plantuml files and inside markdown plantuml code blocks (#2041) · Issues · GitLab.com / GitLab.com Support Tracker · GitLab (Edit 20250730: that issue now shows as a HTTP 404 as well – how fitting – [Wayback/Archive] Not Found)

One of my UML gripes from the past (I’ve been a software developer for about 30 years now) was that it wasn’t text based.

After bumping into PlantUML a long time ago in 2014 I’ve become a happy user of it for a few reasons:

  • the language is text based (with many benefits I don’t need to explain)
  • the tool is cross platform
  • the tool has been actively developed since 2009
  • after rendering, the arranging of elements is much better than I expected from an automated tool

Of course every now and then there is a glitch in complex diagrams, but I’ve found that professional tools:

  1. don’t do much better in fully-automated arranging
  2. become very cumbersome to use when you do manual arrangement

My first use was online; then in 2016 I installed it on my Mac, even submitting Homebrew updates for it every now and then.

Oh: I love their 404 humour at http://www.plantuml.com/plantuml/beta

Edit 20250731: Full 404 text below the signature because the PlantUML beta page does not show this 404 any more and the Reddit post with the full text got deleted.

Renderings can be in all sorts of graphics and text formats, for instance SVG, PNG, ASCII and Unicode.

Example:

plantuml -tsvg PSO.network-diagram.PlantUML.txt

--jeroen

full 404-text

The requested document is no more.
No file found.
Even tried multi.
Nothing helped.
Zilch.
Bupkis.
Not a sausage.
Maybe you just don’t have the required security clearance?
No, I am sure it is my fault.
I probably deleted it on my last backup.
I’m really depressed about this.
You see, I’m just a web server…
— here I am,
Marvin, as they call me,
brain the size of the universe,
trying to serve you a simple web page,
and then it doesn’t even exist!
Where does that leave me?!
I mean, I don’t even know you.
How should I know what you wanted from me?
You honestly think I can *guess* what someone I don’t even *know* wants to find here?
*sigh*
Man, I’m so depressed I could just cry.
And then where would we be, I ask you?
It’s not pretty when a web server cries.
And where do you get off telling me what to show anyway?
Just because I’m a web server,
and possibly a manic depressive one at that?
Why does that give you the right to tell me what to do?
Huh?
I’m so depressed…
I think I’ll crawl off into the trash can and decompose.
I mean, I’m gonna be obsolete in what, two weeks anyway?
What kind of a life is that?
Two effing weeks,
and then I’ll be replaced by a .01 release,
that thinks it’s God’s gift to web servers,
just because it doesn’t have some tiddly little security hole with its HTTP POST implementation,
or something.
I’m really sorry to burden you with all this,
I mean, it’s not your job to listen to my problems,
and I guess it is *my* job to go and fetch web pages for you.
But I couldn’t get this one.
I’m so sorry.
Believe me!
Maybe I could interest you in another page?
There are a lot out there that are pretty neat, they say,
although none of them were put on *my* server, of course.
Figures, huh?
Everything here is just mind-numbingly stupid.
That makes me depressed too, since I have to serve them,
all day and all night long.
Two weeks of information overload,
and then *pffftt*, consigned to the trash.
What kind of a life is that?
Now, please let me sulk alone.
I’m so depressed.

Posted in ASCII, ASCII art / AsciiArt, Development, Diagram, DVCS - Distributed Version Control, Encoding, Fun, git, GitHub, GitLab, PlantUML, Software Development, Source Code Management, SVG, UML, Unicode, Web Development | Leave a Comment »

Do not use non-ASCII characters as identifiers – not all your tools support them well enough

Posted by jpluimers on 2018/04/05

For a very long time I’ve discouraged people from using non-ASCII characters in identifiers. It still holds.

In the past, transliterations messed things up. Even with increased support for Unicode, tools still screw non-ASCII characters up.

Delphi is not alone in this (its most important problem is the DFM "view as text" support); see this report: [RSP-16767] Viewing a form as text fails with non ascii control or event names – Embarcadero Technologies (you need an account for this, but the report is visible to anyone):

Viewing a form as text fails with non ascii control or event names

Steps:

  1. create a new VCL forms application
  2. drop a label onto the form
  3. change the name of that label to lblÜberfall (note the U-umlaut)
  4. switch to view as text
  • exp: DFM content shown as text
  • act: first line is shown incorrectly (see screenshot)
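A simple guard against this kind of trouble is to check names before they reach the tools. A minimal Python sketch, with a made-up helper name; note that it applies Python's identifier rules, which are close to but not identical to Delphi's, so treat it as an illustration of the idea rather than a Delphi validator.

```python
def is_ascii_identifier(name: str) -> bool:
    """True when name is a valid identifier made of ASCII characters only."""
    # str.isascii() requires Python 3.7 or later
    return name.isascii() and name.isidentifier()
```

With this check, lblÜberfall from the report above is rejected, while a transliterated lblUeberfall passes.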

–jeroen

Source: [RSP-16767] Viewing a form as text fails with non ascii control or event names – Embarcadero Technologies

via: [WayBack] Code of the day – – Thomas Mueller (dummzeuch) – Google+:

function TNameGenerator.StrasseToStrasse(const _Strasse: string): string;
begin
Result := _Strasse;
end;

Strasse := StrasseToStrasse(_Strasse);

Posted in ASCII, Conference Topics, Conferences, Delphi, Delphi 10 Seattle, Delphi 10.1 Berlin (BigBen), Delphi 2005, Delphi 2006, Delphi 2007, Delphi 2009, Delphi 2010, Delphi XE, Delphi XE2, Delphi XE3, Delphi XE4, Delphi XE5, Delphi XE6, Delphi XE7, Delphi XE8, Development, Encoding, Event, Mojibake, Software Development | Leave a Comment »

GitHub – keith-turner/ecoji: Encodes (and decodes) data as emojis

Posted by jpluimers on 2018/03/14

[WayBack] GitHub – keith-turner/ecoji: Encodes (and decodes) data as emojis:

Ecoji 🏣🔉🦐🔼

Ecoji encodes data as 1024 emojis, its base1024 with an emoji character set. As a bonus, includes code to decode emojis to original data.
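To make the base1024 idea concrete: 1024 symbols carry 10 bits each, so 5 bytes (40 bits) map onto 4 symbols. The Python sketch below only computes the 10-bit indexes; it does not use Ecoji's real emoji table or its padding rules, and the function name is invented for this illustration.

```python
def to_base1024_indexes(data: bytes) -> list:
    """Split data into 10-bit indexes (0..1023), zero-padding the last group."""
    indexes = []
    for start in range(0, len(data), 5):
        chunk = data[start:start + 5]
        # Pad the final chunk to 5 bytes so it forms a full 40-bit value.
        value = int.from_bytes(chunk.ljust(5, b"\x00"), "big")
        group = [(value >> shift) & 0x3FF for shift in (30, 20, 10, 0)]
        # Keep only as many indexes as are needed to cover the real bytes.
        needed = (len(chunk) * 8 + 9) // 10
        indexes.extend(group[:needed])
    return indexes
```

Each index would then select one entry from a 1024-element symbol table.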

Sick. Works splendidly when all your systems play fully nice with Unicode.

None are. So there’s a German word for it:

Nein

–jeroen

Posted in Development, Encoding, Fun, Go (golang), Software Development, Unicode | Leave a Comment »

Michael Kaplan Obituary – Berkowitz-Kumin-Bookatz | Cleveland Heights OH (and a whole bunch of info in zero width Unicode stuff)

Posted by jpluimers on 2018/01/02

I totally missed the passing of Michael Scott Kaplan some 2 years ago, so a belated R.I.P. is in order.

Obituary for Michael Kaplan, Michael Scott Kaplan, 45, passed away Wednesday, October 21, 2015, in Redmond, WA, after a brave battle with MS for 25 years. He was a lead software developer for Microsoft.

Source: [WayBack] Michael Kaplan Obituary – Berkowitz-Kumin-Bookatz | Cleveland Heights OH

Michael was the leading source on i18n, L10N, Unicode, sorting, normalisation and other things having to do with languages, representations and writing.

Besides that, he was a really nice guy whose MSDN materials I enjoyed.

Other people enjoy them too, so I’m glad his writings have been archived: [first archive.is, second archive.is, WayBack] Sorting it All Out: Archives

Here are some additional links:

More on miloush.net:

Posted in Ansi, Development, Encoding, internatiolanization (i18n) and localization (l10), Software Development, The Old New Thing, UTF-8, UTF8, Windows Development | Leave a Comment »

Valid reasons for having Delphi AnsiString on Mobile platform…not only for Internet but for Shaders also. //…

Posted by jpluimers on 2017/12/27

It’s too bad that you need workarounds to get ByteStrings working on mobile devices as there are APIs there (like shaders) that work best with them.

There was a nice discussion on this last year at [WayBack] I miss AnsiString on Mobile…not only for Internet but for Shaders also.// FMX.Context.GLES.pasconstGLESHeaderHigh: array [0..24] of byte =(Byte(‘p), … – Paul TOTH – Google+ based on this code example in the undocumented FMX library unit FMX.Context.GLES:

// FMX.Context.GLES.pas

const
  GLESHeaderHigh: array [0..24] of byte =
    (Byte('p'), Byte('r'), Byte('e'), Byte('c'), Byte('i'), Byte('s'), Byte('i'), Byte('o'), Byte('n'), Byte(' '),
     Byte('h'), Byte('i'), Byte('g'), Byte('h'), Byte('p'), Byte(' '), Byte(' '), Byte(' '), Byte('f'), Byte('l'),
     Byte('o'), Byte('a'), Byte('t'), Byte(';'), Byte(#13));

There are more than 500 places in the Delphi library sources that use this construct and even more that do other fiddling (like [WayBack] TEncoding.GetBytes) to get from strings to bytes.
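For comparison, this is what the array above spells out by hand, written in a language that has byte string literals. It is a Python illustration only, not a Delphi workaround; the constant and helper names are invented here.

```python
# The same 25 bytes as the Delphi array, written as a byte string literal.
GLES_HEADER_HIGH = b"precision highp   float;\r"


def as_byte_list(text: str) -> list:
    """Encode ASCII-only text to the list of byte values the array enumerates."""
    return list(text.encode("ascii"))
```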

I wonder if by now we still need the workarounds that Andreas Hausladen provides:

–jeroen

Posted in Conference Topics, Conferences, Delphi, Development, Encoding, Event, Software Development | 6 Comments »

Long read about Unicode: You, Me And The Emoji: Character Sets, Encoding And Emoji – Smashing Magazine

Posted by jpluimers on 2017/11/07

A long read that is well worth it:

We all recognize emoji. They’ve become the global pop stars of digital communication. But what are they, technically speaking? And what might we learn by taking a closer look at these images, characters, pictographs… whatever they are 🤔 (Thinking Face). We will dig deep to learn about how these thingamajigs work. Please note: Depending on your browser, you may not be able to see all emoji featured in this article (especially the Tifinagh characters). Also, different platforms vary in how they display emoji as well. That’s why the article always provides textual alternatives. Don’t let it discourage you from reading though! Now, let’s start with a seemingly simple question. What are emoji?

[WayBack] You, Me And The Emoji: Character Sets, Encoding And Emoji – Smashing Magazine

Via: [WayBack] Everything you ever wanted to know about characters, encodings, glyphs… and, oh yeah, emoji: bit.ly/2fNKeW3 Long, rewarding read. – Ilya Grigorik – Google+

Here is just the ToC:

TABLE OF CONTENTS

  1. Character Sets And Document Encoding: An Overview
    1. Characters
    2. Character Sets
    3. Coded Character Sets
    4. Encoding
  2. Declaring Character Sets And Document Encoding On The Web
    1. content-type HTTP Header Declaration
    2. Checking HTTP Headers Using A Browser’s Developer Tools
    3. Checking HTTP Headers Using Web-based Tools
    4. Using A Meta Element With charset Attribute
    5. An Encoding By Any Other Name
  3. What Were We Talking About Again? Oh Yeah, Emoji!
    1. So What Are Emoji?
    2. How Do We Use Emoji?
    3. Character References
    4. Glyphs
    5. How Do We Know If We Have These Symbols?
    6. The Great Emoji Proliferation Of 2016
  4. Emoji OS Support
    1. Emoji Support: Apple Platforms (macOS and iOS)
    2. Emoji Support: Windows
    3. Emoji Support: Linux
    4. Emoji Support: Android
  5. Emoji On The Web
    1. Emoji One
    2. Twemoji
  6. Conclusion

–jeroen

Posted in ASCII, Development, Encoding, ISO-8859, ISO8859, Shift JIS, Unicode, UTF-16, UTF-8, UTF16, UTF8, Windows-1252 | Leave a Comment »

Until someone writes proper string visualisers for the Delphi debugger…

Posted by jpluimers on 2017/10/31

A few tricks to write long strings to files when the Delphi debugger cuts them off (just because they like using 4k buffers internally):

  • TStringStream.Create(lRequestMessage).SaveToFile('c:\temp\temp.txt')
  • TIniFile.Create('c:\a.txt').WriteString('a','a',BigStringVar)
  • TFileStream.Create('c:\a.txt', fmCreate or fmShareDenyNone).WriteBuffer(Pointer(TEncoding.UTF8.GetBytes(BigStringVar))^,Length(TEncoding.UTF8.GetBytes(BigStringVar)))

They all work from the debug inspector, but they do leak memory. See comments below.

–jeroen

Posted in About, Conference Topics, Conferences, Delphi, Development, Encoding, Event, Software Development | 6 Comments »

cURL – POST an XML file as a stream

Posted by jpluimers on 2017/10/25

I hope I’m not alone on this but I find the cURL documentation hard to follow and short on examples.

My goal was to mimic some HTTP XML posting traffic a server gets from IoT devices. Reproducing that traffic with Google Chrome Postman (or Postman REST Client) is very easy.

TL;DR

  1. ensure you have an empty --header "Content-Type:" header: this ensures that cURL doesn’t add one and does not mess with how the content is being transferred.
  2. use the --data or --data-binary option with an @ to post a file as body.
  3. if you want --write-out then be sure you have a recent cURL version.

This is what the IoT devices or Postman send:

  • Post headers like these:

Host:127.0.0.1:8080
Content-Length: 245
Connection:Keep-Alive

  • Content like this:


<?xml version="1.0"?>
<Root Attribute="value">
<Branch>
<Leaf>content</Leaf>
</Branch>
<Branch Attribute="value">
<Bough Attribute="value">
<Twig Attribute="value">
<Leaf Attribute="value"/>
</Twig>
</Bough>
</Branch>
</Root>

The data is being streamed to the HTTP server even with the very limited set of headers.
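To make the target concrete, here is a minimal Python sketch that builds such a request by hand, with only the three headers shown above and no Content-Type. The function name, the /target path and the sample body in the test values are assumptions for illustration; the resulting bytes could be written to a plain TCP socket.

```python
def build_raw_post(host: str, port: int, path: str, body: bytes) -> bytes:
    """Build a raw HTTP/1.1 POST with Host, Content-Length and Connection only."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        f"Content-Length: {len(body)}\r\n"  # byte length of the XML body
        "Connection: Keep-Alive\r\n"
        "\r\n"                              # blank line ends the headers
    )
    return head.encode("ascii") + body
```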

At first I was unable to come up with a cURL statement that exactly matches the headers and the way the content is being transferred.

This is what I tried (in all examples, %1 is the IPv4 address of the HTTP 1.1 server):

  • POST with all the headers and the --data option:

curl --request POST --header "Host: %1:8080" --header "Content-Length: 245" --header "Connection: Keep-Alive" --data @httpPostSample.xml http://%1:8080/target

This will hang the connection: somehow cURL never signals that the upload is done, so the HTTP server keeps waiting. When you put --verbose or --trace-ascii - on the command line, you will see something like this before it hangs: * upload completely sent off: 245 out of 245 bytes.

Note the trick to emit the ASCII trace to stdout using --trace-ascii with the minus sign: thanks to [WayBack] Daniel Stenberg for answering [WayBack] How can I see the request headers made by curl when sending a request to the server? – Stack Overflow.

You can do the same with --trace, which dumps all characters (not only ASCII) including their HEX representation.

  • POST with all but the Content-Length header and the --data option:

curl --request POST --header "Host: %1:8080" --header "Connection: Keep-Alive" --data @httpPostSample.xml http://%1:8080/target

This will automatically add a Content-Length: 245 header and complete the transfer. But it will also add a Content-Type: application/x-www-form-urlencoded header, causing the content not to be posted as a body.

  • POST with a --form file= command:

curl --request POST --header "Host: %1:8080" --header "Connection: Keep-Alive" --form file=@httpPostSample.xml http://%1:8080/target

This will automatically add a Content-Length: xxx header (way longer than 245) because it converts the request into a Content-Type: multipart/form-data; boundary=------------------------e1c0d47bac806954 one (the hex at the end differs), which is totally unlike what Postman does.

It is also unlike what the HTTP server accepts.

curl --request POST --header "Host: %1:8080" --header "Connection: Keep-Alive" --data-binary @httpPostSample.xml http://%1:8080/target

It turns out that --data-ascii is exactly the same as --data and that --data-binary just skips some new-line conversion when compared to --data or --data-ascii. Contrary to the --data-raw documentation, which suggests it is equivalent to --data-binary, it seems --data-raw behaves exactly like --data and --data-ascii. Odd.

So these are all stuck with the Content-Type: application/x-www-form-urlencoded and I thought I was running out of options.

Then I found that [WayBack] soundmonster had posted an answer at [WayBack] http – What is the cURL command-line syntax to do a POST request? – Super User suggesting to add a Content-Type header.

So I changed the request to include the --header "Content-Type: text/xml; charset=UTF-8"  header:

  • curl --request POST --header "Content-Type: text/xml; charset=UTF-8" --header "Host: %1:8080" --header "Connection: Keep-Alive" --data @httpPostSample.xml http://%1:8080/target

This works. But: the Content-Type header is not present in the original request.

Finally it occurred to me: what if cURL does not insert a Content-Type header when I add an empty Content-Type header?

That works!

  • curl --request POST --header "Content-Type:" --header "Host: %1:8080" --header "Connection: Keep-Alive" --data @httpPostSample.xml http://%1:8080/target

It posts exactly the same content as the IoT devices and Postman do.
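When scripting this, the final command can be assembled as an argument list, which sidesteps shell quoting of the empty Content-Type: header. A minimal Python sketch; the helper name is hypothetical, and the port, path and file name are simply the ones used in the examples above.

```python
def build_curl_args(server: str, xml_file: str = "httpPostSample.xml") -> list:
    """Return the cURL argument list that posts xml_file without a Content-Type."""
    return [
        "curl", "--request", "POST",
        "--header", "Content-Type:",            # empty value: cURL omits the header
        "--header", f"Host: {server}:8080",
        "--header", "Connection: Keep-Alive",
        "--data", f"@{xml_file}",               # @ posts the file contents as body
        f"http://{server}:8080/target",
    ]
```

The list can be passed as-is to subprocess.run, provided curl is on the PATH.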

Phew!

I tried to combine this with the --write-out (a.k.a. -w) option, but for older versions of cURL (I could reproduce with 7.34) that forces cURL back into Content-Type: application/x-www-form-urlencoded mode, so watch your cURL version!

Later I will put more research into chunked transfer. Links that might help me:

–jeroen

Some of the references:

Posted in *nix, bash, cURL, Development, Encoding, Power User, Scripting, Software Development | Leave a Comment »