The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘Assembly Language’ Category

A Tale of Many Divisions – Naive Prime Factorization Across a Handful of Architectures

Posted by jpluimers on 2020/06/16

[WayBack] A Tale of Many Divisions – Naive Prime Factorization Across a Handful of Architectures

Source code: [WayBack] GitHub – blu/euclid: An extremely naive prime factorizer

Via: [WayBack] Blu looks at how a small piece of code with divisions surprisingly behave on various architecture: #Arm, #MIPS, and #x86. – Jean-Luc Aufranc – Google+

–jeroen

Posted in Assembly Language, C++, Development, Software Development

Insentricity :: Kermit on the JAIR 8080 ::

Posted by jpluimers on 2019/05/22

Cool: [WayBack] Insentricity :: Kermit on the JAIR 8080 ::

Repository: [WayBack] FozzTexx/Kermit-CPM: Columbia University’s Kermit for CP/M

–jeroen

Posted in Assembly Language, Development, History, Software Development

Some notes on losing performance because of using AVX

Posted by jpluimers on 2019/03/20

It looks like AVX can be a curse most of the time. Below are some (many) links that led me to this conclusion, based on a thread started by Kelly Sommers.

My conclusion

Running AVX instructions lowers the processor frequency, which slows down non-AVX code as well, so you only benefit when the gain from the AVX code outweighs the loss for everything else running on that processor in the same time frame.

In practice, this means you need a long-term gain from AVX on many cores. If you don’t have that, performance on all cores, including the initial AVX performance, will degrade, often by a lot (dozens of percent).
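The trade-off above can be sketched with a small Amdahl-style model. This is only a rough illustration, not a measurement: the function names, the idea of a single constant `clock_ratio`, and the example numbers are all assumptions made here, and real processors change frequency with delays and per-core differences that this ignores.

```python
def relative_runtime(avx_fraction, avx_speedup, clock_ratio):
    """Runtime with AVX divided by runtime without AVX.

    avx_fraction: share of the original runtime that can use AVX (0..1).
    avx_speedup:  how much faster that share runs at equal clock speed.
    clock_ratio:  AVX clock divided by normal clock (assumed constant
                  and assumed to apply to ALL code in the time frame).
    A result below 1.0 means AVX is a net win, above 1.0 a net loss.
    """
    if not 0.0 <= avx_fraction <= 1.0:
        raise ValueError("avx_fraction must be between 0 and 1")
    if avx_speedup <= 0.0 or not 0.0 < clock_ratio <= 1.0:
        raise ValueError("avx_speedup must be > 0 and 0 < clock_ratio <= 1")
    # Work left after vectorising, measured in normal-clock time units.
    work = avx_fraction / avx_speedup + (1.0 - avx_fraction)
    # Everything runs slower because of the reduced clock.
    return work / clock_ratio


def break_even_fraction(avx_speedup, clock_ratio):
    """Smallest avx_fraction at which AVX stops being a net loss."""
    if avx_speedup <= 1.0:
        raise ValueError("no break-even point without a real speedup")
    return (1.0 - clock_ratio) / (1.0 - 1.0 / avx_speedup)


# Hypothetical numbers: 2x speedup on the AVX part, 15% lower clock.
for fraction in (0.1, 0.3, 0.8):
    print(fraction, round(relative_runtime(fraction, 2.0, 0.85), 3))
```

With these made-up numbers, a workload that spends only 10% of its time in AVX-capable code ends up slower overall, while one that spends 80% there comes out clearly ahead.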

Tweets and pages linked by them

Kelly raised a bunch of interesting questions and remarks because of the above:

I collected the above links because of [WayBack] GitHub – maximmasiutin/FastMM4-AVX: FastMM4 fork with AVX support and multi-threaded enhancements (faster locking), where it is unclear which part of the gains comes from AVX and which part from the other optimizations. It looks like under heavy load in data-center-like conditions, the total gain is about 30%. The loss for traditional processing there has not been measured, but based on the above, my estimate is that it is at least 20%.

Full tweets below.

Read the rest of this entry »

Posted in Assembly Language, Development, Software Development, x64, x86

performance – Why is this C++ code faster than my hand-written assembly for testing the Collatz conjecture? – Stack Overflow

Posted by jpluimers on 2019/02/28

Geek pr0n at [WayBack] performance – Why is this C++ code faster than my hand-written assembly for testing the Collatz conjecture? – Stack Overflow

Via: [WayBack] Very nice #Geekpr0n “Why is C++ faster than my hand-written assembly code?” The comments are of high quality i… – Jan Wildeboer – Google+

–jeroen

Posted in Assembly Language, C, C++, Development, Software Development, x64, x86

A reference to the 6502 in “Remember that in a stack trace, the addresses are return addresses, not call addresses – The Old New Thing”

Posted by jpluimers on 2018/09/11

On x86/x64/ARM/…:

It’s where the function is going to return to, not where it came from.

And:

Bonus chatter: This reminds me of a quirk of the 6502 processor: When it pushed the return address onto the stack, it actually pushed the return address minus one. This is an artifact of the way the 6502 is implemented, but it results in the nice feature that the stack trace gives you the line number of the call instruction.

Of course, this is all hypothetical, because 6502 debuggers didn’t have fancy features like stack traces or line numbers.

Source: [WayBack] Remember that in a stack trace, the addresses are return addresses, not call addresses – The Old New Thing

Which resulted in these comments at [WayBack] CC +mos6502 – Jeroen Wiert Pluimers – Google+:

  • mos6502: And don’t forget the crucial difference in PC on 6502 between RTS and RTI!
  • Jeroen Wiert Pluimers: +mos6502 I totally forgot about that one. Thanks for reminding me
    <<Note that unlike RTS, the return address on the stack is the actual address rather than the address-1.>>

References:

[WayBack] 6502.org: Tutorials and Aids – RTI

RTI retrieves the Processor Status Word (flags) and the Program Counter from the stack in that order (interrupts push the PC first and then the PSW).

Note that unlike RTS, the return address on the stack is the actual address rather than the address-1.

[WayBack] 6502.org: Tutorials and Aids – RTS

RTS pulls the top two bytes off the stack (low byte first) and transfers program control to that address+1. It is used, as expected, to exit a subroutine invoked via JSR which pushed the address-1.

RTS is frequently used to implement a jump table where addresses-1 are pushed onto the stack and accessed via RTS eg. to access the second of four routines.
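The JSR/RTS/RTI behaviour quoted above can be modelled in a few lines. This is a minimal sketch, not an emulator: the `Mini6502` class, its method names, and the example addresses are hypothetical, and the stack is a plain Python list instead of page $01 with a stack pointer.

```python
class Mini6502:
    """Toy model of how the 6502 stores return addresses on the stack."""

    def __init__(self):
        self.stack = []  # simplification: a list instead of page $01
        self.pc = 0

    def push(self, byte):
        self.stack.append(byte & 0xFF)

    def pull(self):
        return self.stack.pop()

    def jsr(self, target):
        # JSR is 3 bytes long and pc points at its opcode. The address of
        # its LAST byte (the return address minus one) is pushed,
        # high byte first.
        ret_minus_1 = (self.pc + 2) & 0xFFFF
        self.push(ret_minus_1 >> 8)
        self.push(ret_minus_1)
        self.pc = target

    def rts(self):
        # Pull low byte, then high byte, and continue at that address + 1.
        lo = self.pull()
        hi = self.pull()
        self.pc = (((hi << 8) | lo) + 1) & 0xFFFF

    def interrupt(self, handler, status):
        # Interrupts push the actual PC first and then the status word.
        self.push(self.pc >> 8)
        self.push(self.pc)
        self.push(status)
        self.pc = handler

    def rti(self):
        # Pull status, then the PC; unlike RTS there is no +1.
        status = self.pull()
        lo = self.pull()
        hi = self.pull()
        self.pc = (hi << 8) | lo
        return status

    def jump_via_rts(self, table, index):
        # Jump-table trick: the table holds (address - 1) entries, so
        # pushing one and executing RTS lands on the routine itself.
        target_minus_1 = table[index]
        self.push(target_minus_1 >> 8)
        self.push(target_minus_1)
        self.rts()


cpu = Mini6502()
cpu.pc = 0x8000
cpu.jsr(0x9000)
print([hex(b) for b in cpu.stack])  # ['0x80', '0x2']: address of the JSR's last byte
cpu.rts()
print(hex(cpu.pc))                  # 0x8003: the instruction after the JSR
```

For the jump table, a list such as `[addr - 1 for addr in (0xC000, 0xC100, 0xC200, 0xC300)]` with index 1 ends up at `0xC100`, the second of four routines.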

–jeroen

Posted in 6502, 6502 Assembly, Assembly Language, Development, History, Software Development, The Old New Thing, Windows Development, x64, x86

 