Delphi – MD5: the MessageDigest_5 unit has been there since Delphi 2007
Posted by jpluimers on 2009/12/11
I still see a lot of people crafting their own MD5 implementation.
A lot of the existing MD5 implementations do not work well in Delphi 2009 and later (because they need to be adapted to Unicode).
Many of those existing implementations behave differently if you pass the same ASCII characters as AnsiString or UnicodeString.
The MessageDigest_5 unit has been available in Delphi since Delphi 2007.
This is the location relative to your installation directory: source\Win32\soap\wsdlimporter\MessageDigest_5.pas
(Edit: 20091223: Since Delphi 7.01, Indy has provided the unit IdHashMessageDigest which also does md5, see the comments below)
So this unit is used by the WSDL importer and, more importantly, it works with Unicode (if you pass it a string with Unicode characters, it will convert them to UTF-8 first).
The unit is not in your default search path, and has not been very well promoted (the only link at the Embarcadero site was an article by Pawel Glowacki), so few people know about it.
Now you know too :-)
Note that MD5 is normally used to hash binary data.
It is not wise to send a non-ASCII string through both the AnsiString and UnicodeString versions: because of the different encoding (and therefore a different binary representation), you can get different results depending on the Delphi version used.
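To make the "different binary representation" point concrete, here is a minimal sketch that only counts bytes and does not hash anything. It assumes Delphi 2009 or later (for UnicodeString and RawByteString); the program name and variable names are made up for illustration.

```delphi
program EncodingBytes;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  U: UnicodeString;
  A: AnsiString;
  Utf8: RawByteString;
begin
  // 'föøbår' written with character codes, so the encoding of the
  // source file itself cannot influence the result.
  U := 'f'#$00F6#$00F8'b'#$00E5'r';
  A := AnsiString(U);     // converted to the system ANSI code page
  Utf8 := UTF8Encode(U);  // the same text as UTF-8 bytes

  // The same six characters occupy a different number of bytes in each
  // representation, so hashing the raw bytes gives different digests.
  Writeln('UTF-16 bytes: ', Length(U) * SizeOf(WideChar));
  Writeln('ANSI bytes:   ', Length(A));
  Writeln('UTF-8 bytes:  ', Length(Utf8));
end.
```

The UTF-16 form takes 12 bytes and the UTF-8 form 9 bytes; the ANSI byte count depends on the system code page (6 on a Windows-1252 system). An MD5 over any of these byte sequences is a valid MD5, just of different input.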
The usage sample at the end of this post shows that the AnsiString/UnicodeString issue described above does not occur for ASCII strings, nor for ANSI strings: both get encoded using UTF-8 before hashing.
Delphi 2007 did not do the UTF-8 encoding, so you will see different results here.
You will also see that Writeln uses the console encoding, which is different from the encoding the code editor uses.
Edit: 20091216 – added RawByteString example to show that the conversion does not matter.
program md5;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  MessageDigest_5 in 'C:\Program Files\Embarcadero\RAD Studio\7.0\source\Win32\soap\wsdlimporter\MessageDigest_5.pas';
  // Vista/Windows 7: MessageDigest_5 in 'C:\Program Files (x86)\Embarcadero\RAD Studio\7.0\source\Win32\soap\wsdlimporter\MessageDigest_5.pas';

function GetMd5(const Value: AnsiString): string; overload;
var
  hash: MessageDigest_5.IMD5;
  fingerprint: string;
begin
  hash := MessageDigest_5.GetMD5();
  hash.Update(Value);
  fingerprint := hash.AsString();
  Result := LowerCase(fingerprint);
end;

function GetMd5(const Value: UnicodeString): string; overload;
var
  hash: MessageDigest_5.IMD5;
  fingerprint: string;
begin
  hash := MessageDigest_5.GetMD5();
  hash.Update(Value);
  fingerprint := hash.AsString();
  Result := LowerCase(fingerprint);
end;

var
  SourceAnsiString: AnsiString;
  SourceUnicodeString: UnicodeString;
  SourceRawByteString: RawByteString;

begin
  try
    SourceAnsiString := 'foobar';
    SourceUnicodeString := 'foobar';
    SourceRawByteString := 'foobar';

    Writeln(GetMd5(SourceAnsiString));
    Writeln(GetMd5(SourceUnicodeString));
    Writeln(GetMd5(SourceRawByteString));

    SourceAnsiString := 'föøbår';
    SourceUnicodeString := 'föøbår';
    SourceRawByteString := 'föøbår';
    Writeln(SourceAnsiString, ' ', GetMd5(SourceAnsiString));
    Writeln(SourceUnicodeString, ' ', GetMd5(SourceUnicodeString));
    Writeln(SourceRawByteString, ' ', GetMd5(SourceRawByteString));
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
–jeroen
Tom Borysiak said
I hope this can help anyone else who may run into this: Marcel’s changes to get Peter’s code to compile can produce inconsistent results. I would randomly get extra chars returned in the DigestStr. I found that changing TDigestStr = String[0..32]; to TDigestStr = Array[0..32] of Char; allows the code to compile and produces correct results every time.
jpluimers said
Thanks for that!
Marcel Simunek said
And this:
Procedure TMD5.Add (Const Value: String);
Begin
Update(PChar(RawByteString(Value))^, Length(RawByteString(Value)));
End;
Guus Creuwels said
Hi,
Is the speed of the MessageDigest_5 functions the same as or better than that of the MD5 functions from http://www.sawatzki.de/Download/Delphi_MD5.zip?
I used the Indy version but had some performance issues when calculating the hash of large files. The MD5 methods from http://www.sawatzki.de are much faster.
Thanks.
Regards,
Guus
jpluimers said
Peter Sawatzki writes highly optimized code, so I think his code will be faster.
Right now I don’t have time to measure; can you try to measure?
–jeroen
GuusCreuwels said
Unfortunately the md5 units from Peter do not compile in Delphi 2010. That’s how I landed on this page…
jpluimers said
Please drop me an email about this; I’ll try to look at it this weekend.
–jeroen
Marcel Simunek said
Only a few changes and it will work ;-) See Marco Cantu’s Delphi 2009 handbook and you will see which changes came in D2009 (and D2010 also ;-) ).
Type
TDigestStr = Array[0..32] of AnsiChar;
and function GetDigestStr:
Function TCustomMD5.GetDigestStr: TDigestStr;
Const
hc: Array[0..$F] Of AnsiChar = '0123456789abcdef';
Var
aDigest: TDigest;
i: 0..15;
Begin
aDigest:= Digest;
// Result[0]:= #32;
For i:= 0 To 15 Do Begin
Result[0+i Shl 1] := hc[aDigest[i] Shr 4];
Result[1+i Shl 1] := hc[aDigest[i] And $F];
End
End;
Does it work now? I think yes ;-)
Bye, Marcel – Czech republic :)
Alan said
Sorry, it should be replaced with lineText rather than Value:
ReadLn(myFile, lineText);
hash.Update(lineText);
ReadLn(myFile, lineText);
hash.Update(lineText);
ReadLn(myFile, lineText);
hash.Update(lineText);
fingerprint := hash.AsString();
Result := LowerCase(fingerprint);
Marry said
Hi, can you write me a good procedure to get the MD5 of a file? Right now I am using a File2String procedure and then your function. It consumes a lot of memory and is slow.
Thx.
jpluimers said
Just feed the MD5 engine a string from the file at a time and you should be fine!
–jeroen
Alan said
Are you saying something like this if my file has 3 lines of text:
var
hash: MessageDigest_5.IMD5;
fingerprint: string;
myFile: TextFile;
lineText: String;
……
begin
hash := MessageDigest_5.GetMD5();
…..
ReadLn(myFile, lineText);
hash.Update(Value);
ReadLn(myFile, lineText);
hash.Update(Value);
ReadLn(myFile, lineText);
hash.Update(Value);
fingerprint := hash.AsString();
Result := LowerCase(fingerprint);
end;
jpluimers said
Indeed.
–jeroen
Marcel Simunek said
Use Peter Sawatzki’s code – there are functions:
Morwath said
Maybe in Delphi 2014 we’ll see a SHA hash…
jpluimers said
SHA has been there since at least Delphi 2005:
D2005\source\Indy10\Protocols\IdHashSHA1.pas
–jeroen
nader said
Thanks for this tip. It helped me a lot.
apz28 said
Why did it not use the RawString type and ignore the conversion altogether?
jpluimers said
Actually, it is RawByteString, and in this case the conversion does not matter.
I have added a RawByteString to the example: the results are the same as for UnicodeString and AnsiString.
–jeroen
Gad D Lord said
Perfect. So far I have used the
IdHashCrc.pas
IdHashMessageDigest.pas
IdHashSHA1.pas
methods from Indy source. They also have SHA-1 and CRC-16, CRC-32.
jpluimers said
Duh – I totally forgot that Indy has that as well.
And even better: IdHashMessageDigest.pas has been there since Delphi 7:
D7.01.Architect\Source\Indy\IdHashMessageDigest.pas
–jeroen
Yogi Yang said
Thanks for this jewel.
Peter Bartholdsson said
And I see you wrote ASCII above; I was reading too fast, as usual.
Still, my gut reaction is: don’t use it, as it doesn’t actually produce the expected MD5 for a Unicode string. ;)
Peter Bartholdsson said
Actually it’ll only produce the same result if you’re using ASCII characters, as it converts the Unicode string to UTF-8.
Using this class to produce an MD5 of string values seems risky to me. Do it properly: don’t expect a UnicodeString and an AnsiString to produce the same MD5, because they most certainly shouldn’t.
The following example should produce different MD5 checksums (using Latin-1 / ISO/IEC 8859-1 as your AnsiString locale):
SourceAnsiString := 'fööbar';
SourceUnicodeString := 'fööbar';
jpluimers said
Actually, they don’t. At least not when used in Delphi 2009 or 2010. The reason is that both strings get converted to UTF-8 before hashing.
Delphi 2007 does not do that conversion, so you will see different results between Delphi 2007 and 2009/2010.
But you should normally only use md5 for hashing binary data.
–jeroen
IL said
Thank you for the info, Jeroen. But, oops! RAD Studio 7.0 is Delphi 2010, not 2009. Neither the 2007 trial version nor 2009 contains the MessageDigest_5 unit, as source or compiled.
jpluimers said
Actually, since I have the RTL and VCL sources for all Delphi versions in a central place:
RTL-VCL-Sources\D2007\source\Win32\soap\wsdlimporter\MessageDigest_5.pas
RTL-VCL-Sources\D2009\source\Win32\soap\wsdlimporter\MessageDigest_5.pas
RTL-VCL-Sources\D2010\source\Win32\soap\wsdlimporter\MessageDigest_5.pas
Maybe I have them because I always have the Enterprise or Architect edition.
–jeroen
Dimitrij said
Thanks. I had no idea about its existence…:)