Default comparers in Delphi used by TArray.Sort (via: Stack Overflow)
Posted by jpluimers on 2014/11/26
A long while ago, I posted a detailed answer on what functions the default comparers actually were calling to get a feel for if they would apply or not answering delphi – What does the default TArray.Sort comparator actually do and when would you use it? – Stack Overflow.
I needed that information recently because of some sorting issues I bumped into (sorting generic records), so finally a blog post.
First some links to documentation for even more background information:
- System.Generics.Collections:
– TArray.Sort method. - System.Generics.Defaults:
– IComparer interface with the Compare method.
– IEqualityComparer with the Equals and GetHashCode methods.
– _LookupVtableInfo.
– BinaryCompare.
– BobJenkinsHash.
– TOrdinalIStringComparer. - Generics Defaults TComparer (Delphi) – RAD Studio Code Examples.
There is the answer I gave:
David did a great job of textually describing how the default comparers work, but for some of you it might be easier to follow when you see how the underlying code is structured (and decide if the default comparers do apply).
I will cover only the
Compare_style of comparisons. TheEquals_style works in a similar way.What happens is that
_LookupVtableInfoselects anIComparerinterface forCompare_style comparisons (and aIEqualityComparerforEquals_style).Underneath those interfaces are not ordinary interfaces, but interface wrappers around global functions of this form for
Compare_style:function Compare_t<T>(Inst: Pointer; const Left, Right: T): Integer;and global procedures of this form for
Equals_style:function Equals_t<T>(Inst: Pointer; const Left, Right: T): Integer; function GetHashCode_t<T>(Inst: Pointer; const Left, Right: T): Integer;The outcome of
Compare_style functions is straightforward, but slightly different from -1, 0, +1 that some people might expect:< 0 for Left < Right = 0 for Left = Right > 0 for Left > RightFor the majority of cases, the implementation is very simple:
I have grouped the
Compare_style functions by how they do this.
- Ordinal types (including enumerators and Int64).
- Floating point (Real) types (including Comp and Currency).
- Short strings (from the Turbo Pascal / Delphi 1 days).
- Wide strings (the OLE style ones).
- Methods.
- Pointers (including Classes, Interfaces, Class References and Procedures).
(Ordinal types outside the range of 1, 2, 4, 8 bytes, and real types outside the range of 4, 8, 10 bytes raise an error as they are illegal).
The first group just subtracts
Left from RightRight from Left: signed/unsigned integers of 1 or 2 bytes lengthfunction Compare_I1(Inst: Pointer; const Left, Right: Shortint): Integer; function Compare_I2(Inst: Pointer; const Left, Right: Smallint): Integer; function Compare_U1(Inst: Pointer; const Left, Right: Byte): Integer; function Compare_U2(Inst: Pointer; const Left, Right: Word): Integer; Result := Left - Right;The second group does a comparison:
function Compare_I4(Inst: Pointer; const Left, Right: Integer): Integer; function Compare_I8(Inst: Pointer; const Left, Right: Int64): Integer; function Compare_U4(Inst: Pointer; const Left, Right: LongWord): Integer; function Compare_U8(Inst: Pointer; const Left, Right: UInt64): Integer; function Compare_R4(Inst: Pointer; const Left, Right: Single): Integer; function Compare_R8(Inst: Pointer; const Left, Right: Double): Integer; function Compare_R10(Inst: Pointer; const Left, Right: Extended): Integer; function Compare_RI8(Inst: Pointer; const Left, Right: Comp): Integer; function Compare_RC8(Inst: Pointer; const Left, Right: Currency): Integer; function Compare_WString(Inst: PSimpleInstance; const Left, Right: WideString): Integer; function Compare_Pointer(Inst: PSimpleInstance; Left, Right: NativeUInt): Integer; type {$IFNDEF NEXTGEN} TPS1 = string[1]; TPS2 = string[2]; TPS3 = string[3]; {$ELSE NEXTGEN} OpenString = type string; TPS1 = string; TPS2 = string; TPS3 = string; {$ENDIF !NEXTGEN} function Compare_PS1(Inst: PSimpleInstance; const Left, Right: TPS1): Integer; function Compare_PS2(Inst: PSimpleInstance; const Left, Right: TPS2): Integer; function Compare_PS3(Inst: PSimpleInstance; const Left, Right: TPS3): Integer; // OpenString allows for any String[n], see http://my.safaribooksonline.com/book/programming/borland-delphi/1565926595/5dot-language-reference/ch05-openstring function Compare_PSn(Inst: PSimpleInstance; const Left, Right: OpenString): Integer; if Left < Right then Result := -1 else if Left > Right then Result := 1 else Result := 0; function Compare_Method(Inst: PSimpleInstance; const Left, Right: TMethodPointer): Integer; var LMethod, RMethod: TMethod; begin LMethod := TMethod(Left); RMethod := TMethod(Right); if LMethod < RMethod then Result := -1 else if LMethod > RMethod then Result := 1 else Result := 0; end;Now we get to the interesting bits: the not-so-straightforward outcomes.
Strings use
CompareStr. If you want something different, you can useTOrdinalIStringComparerfunction Compare_LString(Inst: PSimpleInstance; const Left, Right: AnsiString): Integer; function Compare_UString(Inst: PSimpleInstance; const Left, Right: UnicodeString): Integer; Result := CompareStr(Left, Right);
BinaryCompareis used for:
- binary data including unknown, Char/WChar, Set, Array, Record. Exception if the binary data is 1, 2 or 4 bytes size in x86 and x64 and 8 bytes in x64, it will be compared as integers.
- dynamic carrays (be careful when they are multi-dimensional!).
- variants as a last resort (see further below)
For records that can be compared, it makes sense to perform operator overloading, and have the comparer use those operators.
Binary data of 1, 2, 4 or 8 bytes is an exception, which will give strange results on little-endian machines (Intel x86 and x64, and bi-endian Arm in little-endian mode):
function Comparer_Selector_Binary(info: PTypeInfo; size: Integer): Pointer; begin case size of // NOTE: Little-endianness may cause counterintuitive results, // but the results will at least be consistent. 1: Result := @Comparer_Instance_U1; 2: Result := @Comparer_Instance_U2; 4: Result := @Comparer_Instance_U4; {$IFDEF CPUX64} // 64-bit will pass const args in registers 8: Result := @Comparer_Instance_U8; {$ENDIF} else Result := MakeInstance(@Comparer_Vtable_Binary, size); end; end;The rest is pure binary:
function Compare_Binary(Inst: PSimpleInstance; const Left, Right): Integer; begin Result := BinaryCompare(@Left, @Right, Inst^.Size); end; function Compare_DynArray(Inst: PSimpleInstance; Left, Right: Pointer): NativeInt; var len, lenDiff: NativeInt; begin len := DynLen(Left); lenDiff := len - DynLen(Right); if lenDiff < 0 then Inc(len, lenDiff); Result := BinaryCompare(Left, Right, Inst^.Size * len); if Result = 0 then Result := lenDiff; end;As usual,
Variantsare in a league of their own. FirstVarCompareValueis tried. If that fails, thenCompare_UStringis tried. If that fails too,BinaryCompareis tried. If that fails: tough luck.function Compare_Variant(Inst: PSimpleInstance; Left, Right: Pointer): Integer; var l, r: Variant; lAsString, rAsString: string; begin Result := 0; // Avoid warning. l := PVariant(Left)^; r := PVariant(Right)^; try case VarCompareValue(l, r) of vrEqual: Exit(0); vrLessThan: Exit(-1); vrGreaterThan: Exit(1); vrNotEqual: begin if VarIsEmpty(L) or VarIsNull(L) then Exit(1) else Exit(-1); end; end; except // if comparison failed with exception, compare as string. try lAsString := PVariant(Left)^; rAsString := PVariant(Right)^; Result := Compare_UString(nil, lAsString, rAsString); except // if comparison fails again, compare bytes. Result := BinaryCompare(Left, Right, SizeOf(Variant)); end; end; end;
–jeroen






magnoz said
typo: “The first group just subtracts Left from Right” != “Result := Left – Right;”
jpluimers said
Thanks. Fixed.