The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,651 other followers

XOR swap/exchange: nowadays an almost extinct means to exchange two distinct variables of the same size

Posted by jpluimers on 2013/12/19

Almost a year ago, a thread on “premature Delphi optimization” came by on G+ about this code:

procedure ExchangeInteger(var AValue1, AValue2: Integer);
begin
  AValue1 := AValue1 xor AValue2;
  AValue2 := AValue1 xor AValue2;
  AValue1 := AValue1 xor AValue2;
end;

I don’t think that was premature optimization, just some code from an old fart that had already been programming in the era where processors had reasons to use it:

Back then, the only efficient way to exchange two variables of the same data type was using the XOR swap algorithm.

Nowadays you have more options, and this is where the fun in that thread began, which I will show in a minute.

First a bit of history

The XOR swap algorithm was widely known in the 80s of last century and before, especially because the 6502 processor (oh the days of LISA Assembler) was vastly popular, as was the Z80. Together, they powered the majority of the home computers in the 70s and 80s.

Popular 6502 powered computers were Acorn Atom and BBC, Apple II series, Commodore PET and VIC-20 (the Commodore 64 ran on a 6510),  Atari 400/800/XL/XE.

Popular Z80 powered computers were Amstrad CPC, MSX, Exidy Sorcerer,  TRS-80, P2000, Sinclair ZX80ZX81 and ZX Spectrum, KayproOsborne 1 and the Z-80 SoftCard for Apple II.

So back-then you needed the XOR algorighm to do the exchange.

Now the fun part: more choice

Two more variations for the method were proposed.

The first one was to use the XCHG instruction, which has been there since the 8086.

There is even an XCHG EAX, EAX instruction with opcode $90, which is the same as NOP.

Stefan Glienke proposed it, as Delphi compiler adds a MOV with every XOR.

procedure ExchangeInteger(var AValue1, AValue2: Integer);
asm
  xchg ecx, [eax]
  xchg ecx, [edx]
  xchg [eax], ecx
end;

You’d think this is fast: assembler + tiny XCHG instruction. Well, don’t think, but measure: this one is slower, even on modern systems.

Eric Grange did notice the speed difference and came up with this one:

procedure ExchangeInteger(var AValue1, AValue2: Integer);
var
  tmp: Integer;
begin
  tmp := AValue1;
  aValue1 := AVAlue2;
  aValue2 := tmp;
end;

Stefan Glienke concluded: The xor version is faster than the xchg thing. And the version with tmp var is faster (with optimization on ofc).

Real conclusion

The real conclusion however was made by Kenneth Cochran:

I’d leave this one up to the compiler to optimize.

–jeroen

via: Stefan Meisner – Google+ – if premature optimization is the root of all evil then….

7 Responses to “XOR swap/exchange: nowadays an almost extinct means to exchange two distinct variables of the same size”

  1. […] I already wrote a bit on the Z80 processor in XOR swap/exchange: nowadays an almost extinct means to exchange two distinct variables of the same s…. […]

  2. woutervannifterick said

    Reminds me of old days, where

    xor ax,ax

    was faster than:

    mov ax 0

    I didn’t measure it, but I wouldn’t be surprised if this same trick is now actually slower on 64 bits registers.

    Porting to win64 forced me to deal with all places where I had 32bits assembly. Instead of writing 64bits versions, I’ve replaced almost all of it with higher level (Delphi) code, and leave optimization tricks up to the compiler as much as possible. Overall I think it was an improvement for maintainability and portability..

    • jpluimers said

      Since porting from 16-bit to 32-bit I’ve ditched virtually all ASM code, and I’m glad I did: it makes it much easier to go to Android and iOS now.

  3. Anders Andersen said

    I did an implementation of a hand analysis algorithm for hold em poker a while back. It was a conversion from C, if I remember correctly. It involved a huge precalculated lookup table of hand combinations perfectly hashed for instant lookup. Now the C code had the particular XOR swap algoritm which for various reasons needed to be called a lot. In my Delphi implementation I replaced it with the naive temp var algoritm, because I suspected it would actually be faster. Speed tests of the entire algoritm showed that with the temp var swap algorithm, the code was around 5% faster than with the XOR routine.

  4. Eric said

    Alas, the real conclusion doesn’t always hold.

    While “trust the compiler” is true for exchanging integers, it won’t if you try to exchange:
    – interfaces or strings
    – most static arrays
    – non trivial records
    – floats in the 32bits compiler
    – objects in the NextGen ARC compiler
    – etc.

    When it comes to performance, measure, benchmark and profile, everything else is cargo cult ;-)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: