The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Rewritten version (free for non-commercial; small price for commercial use) GitHub – pleriche/FastMM5: FastMM is a fast replacement memory manager for Embarcadero Delphi applications that scales well across multiple threads and CPU cores, is not prone to memory fragmentation, and supports shared memory without the use of external .DLL files.

Posted by jpluimers on 2020/05/05

It has been mentioned a few times already, but for my link archive: [WayBack] GitHub – pleriche/FastMM5: FastMM is a fast replacement memory manager for Embarcadero Delphi applications that scales well across multiple threads and CPU cores, is not prone to memory fragmentation, and supports shared memory without the use of external .DLL files.

From the [WayBack] README.md:

Version 5 is a complete rewrite of FastMM. It is designed from the ground up to simultaneously keep the strengths and address the shortcomings of version 4.992:

  • Multithreaded scaling across multiple CPU cores is massively improved, without memory usage blowout. It can be configured to scale close to linearly for any number of CPU cores.
  • In the Fastcode memory manager benchmark tool FastMM 5 scores 15% higher than FastMM 4.992 on the single threaded benchmarks, and 30% higher on the multithreaded benchmarks. (I7-8700K CPU, EnableMMX and AssumeMultithreaded options enabled.)
  • It is fully configurable runtime. There is no need to change conditional defines and recompile to change options. (It is however backward compatible with many of the version 4 conditional defines.)
  • Debug mode uses the same debug support library as version 4 (FastMM_FullDebugMode.dll) by default, but custom stack trace routines are also supported. Call FastMM_EnterDebugMode to switch to debug mode (“FullDebugMode”) and call FastMM_ExitDebugMode to return to performance mode. Calls may be nested, in which case debug mode will be exited after the last FastMM_ExitDebugMode call.
  • Supports 8, 16, 32 or 64 byte alignment of all blocks. Call FastMM_EnterMinimumAddressAlignment to request a minimum block alignment, and FastMM_ExitMinimumAddressAlignment to rescind a prior request. Calls may be nested, in which case the coarsest alignment request will be in effect.
  • All event notifications (errors, memory leak messages, etc.) may be routed to the debugger (via OutputDebugString), a log file, the screen or any combination of the three. Messages are built using templates containing mail-merge tokens. Templates may be changed runtime to facilitate different layouts and/or translation into any language. Templates fully support Unicode, and the log file may be configured to be written in UTF-8 or UTF-16 format, with or without a BOM.
  • It may be configured runtime to favour speed, memory usage efficiency or a blend of the two via the FastMM_SetOptimizationStrategy call.
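
The nesting rule for alignment requests ("the coarsest alignment request will be in effect") may be easier to see in a small model. The sketch below is a toy Python model of that rule, not FastMM5 code: the class `AlignmentRequests`, its method names and the default of 8 bytes are all hypothetical and only illustrate the behaviour the README describes.

```python
from collections import Counter

SUPPORTED_ALIGNMENTS = (8, 16, 32, 64)  # byte alignments listed in the README


class AlignmentRequests:
    """Toy model of nested minimum-alignment requests (not FastMM5 code)."""

    def __init__(self, default=8):  # the default of 8 is an assumption of this model
        self._default = default
        self._active = Counter()    # outstanding requests per alignment

    def enter(self, alignment):
        """Register a request for a minimum alignment."""
        if alignment not in SUPPORTED_ALIGNMENTS:
            raise ValueError(f"unsupported alignment: {alignment}")
        self._active[alignment] += 1

    def exit(self, alignment):
        """Rescind one prior request for the given alignment."""
        if self._active[alignment] == 0:
            raise RuntimeError(f"no outstanding request for {alignment}")
        self._active[alignment] -= 1

    @property
    def effective(self):
        """The coarsest outstanding request wins; fall back to the default."""
        outstanding = [a for a, n in self._active.items() if n > 0]
        return max(outstanding, default=self._default)


requests = AlignmentRequests()
requests.enter(16)
requests.enter(64)
print(requests.effective)  # 64: the coarsest outstanding request
requests.exit(64)
print(requests.effective)  # 16: the remaining request takes over
```

Nested debug-mode calls follow the same idea with a single counter: debug mode stays on until the last exit call.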

Licence

FastMM 5 is dual-licensed. You may choose to use it under the restrictions of the GPL v3 licence at no cost to you, or you may purchase a commercial licence. A commercial licence includes all future updates. The commercial licence pricing is as follows:

Number Of Developers Price (USD)
1 developer $99
2 developers $189
3 developers $269
4 developers $339
5 developers $399
More than 5 developers $399 + $50 per developer from the 6th onwards
Site licence (unlimited developers) $999

Please send an e-mail to fastmm@leriche.org to request an invoice before or after payment is made at https://www.paypal.me/fastmm (paypal@leriche.org). Support is available for users with a commercial licence via the same e-mail address.
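
The pricing table above can be written as a small calculation. This is a sketch of my reading of the table, not an official price calculator: the function names are made up, and `cheapest_price` assumes a buyer may simply pick the site licence whenever it costs less than the per-developer price.

```python
# Per-developer prices (USD) taken from the table above.
TIERED_PRICES = {1: 99, 2: 189, 3: 269, 4: 339, 5: 399}
EXTRA_DEVELOPER_PRICE = 50  # per developer from the 6th onwards
SITE_LICENCE_PRICE = 999    # unlimited developers


def per_developer_price(developers: int) -> int:
    """Price for a given number of developers, ignoring the site licence."""
    if developers < 1:
        raise ValueError("at least one developer is required")
    if developers <= 5:
        return TIERED_PRICES[developers]
    return TIERED_PRICES[5] + EXTRA_DEVELOPER_PRICE * (developers - 5)


def cheapest_price(developers: int) -> int:
    """Assumption: the site licence may be chosen whenever it is cheaper."""
    return min(per_developer_price(developers), SITE_LICENCE_PRICE)


for count in (1, 5, 6, 17, 18):
    print(count, per_developer_price(count), cheapest_price(count))
```

Under that reading, the per-developer price equals the site licence price at 17 developers (399 + 12 × 50 = 999), and the site licence is cheaper from 18 developers onwards.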

FastMM4 is still free ([WayBack] GitHub – pleriche/FastMM4: A memory manager for Delphi and C++ Builder with powerful debugging facilities), but I recommend considering a switch, as I expect development to focus on FastMM5.

It was made public a few days ago, but has had commits for months: [WayBack] Commits · pleriche/FastMM5 · GitHub

–jeroen

6 Responses to “Rewritten version (free for non-commercial; small price for commercial use) GitHub – pleriche/FastMM5: FastMM is a fast replacement memory manager for Embarcadero Delphi applications that scales well across multiple threads and CPU cores, is not prone to memory fragmentation, and supports shared memory without the use of external .DLL files.”

  1. Maxim Masiutin said

    Here is a single-threaded performance comparison between FastMM5 (v5.01, dated Jun 12, 2020) and FastMM4-AVX (v1.03, dated Jun 14, 2020). The test was run on Jun 16, 2020, on an Intel Core i7-1065G7 CPU (base frequency: 1.3 GHz, 4 cores, 8 threads). Compiled with Delphi 10.3 Update 3, 64-bit target.

                                             FastMM5  AVX-br.   Ratio
                                              ------  ------   ------
    ReallocMem Small (1-555b) benchmark         9285    7013   24.47%
    ReallocMem Medium (1-4039b) benchmark      12002   10186   15.13%
    Block downsize                             12463    9474   23.98%
    VerySmall downsize benchmark               12025   11012    8.42%
    Address space creep benchmark              14212   10845   23.69%
    Address space creep (larger blocks)        16237   13629   16.06%
    Single-threaded reallocate and use         15462   13750   11.07%
    Single-threaded tiny reallocate and use     9263    7203   22.24%
    Single-threaded allocate, use and free     14885   14211    4.53%
    

    You can find the program used to generate the benchmark data at https://github.com/maximmasiutin/FastCodeBenchmark

    You can find FastMM4-AVX branch at https://github.com/maximmasiutin/FastMM4-AVX

    In the tests shown above, the FastMM4-AVX branch is faster than FastMM5.
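
The comment does not say how the Ratio column was computed or what unit the scores are in. The numbers are consistent with (FastMM5 − AVX) / FastMM5, which suggests that lower scores are better (for instance elapsed time). Both points are inferences from the table, not something the commenter states. A short sketch that reproduces the column under that assumption:

```python
# (FastMM5, FastMM4-AVX) scores copied from the table above.
RESULTS = {
    "ReallocMem Small (1-555b) benchmark": (9285, 7013),
    "ReallocMem Medium (1-4039b) benchmark": (12002, 10186),
    "Block downsize": (12463, 9474),
    "VerySmall downsize benchmark": (12025, 11012),
    "Address space creep benchmark": (14212, 10845),
    "Address space creep (larger blocks)": (16237, 13629),
    "Single-threaded reallocate and use": (15462, 13750),
    "Single-threaded tiny reallocate and use": (9263, 7203),
    "Single-threaded allocate, use and free": (14885, 14211),
}


def ratio_percent(fastmm5: int, avx: int) -> float:
    """Relative difference in percent, using the FastMM5 score as the base.

    The formula is inferred from the published numbers, not documented.
    """
    return (fastmm5 - avx) / fastmm5 * 100


for name, (fastmm5, avx) in RESULTS.items():
    print(f"{name:<42} {ratio_percent(fastmm5, avx):6.2f}%")
```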

    Besides that, FastMM5 calls “Winapi.Windows.SwitchToThread” in multithreaded code when it tries to obtain the lock of a block manager. Calling “SwitchToThread” is not a very efficient way to wait in a spin-lock loop. A better way, recommended even by Intel, is to execute the “pause” instruction a number of times, e.g. 5000, and to call “SwitchToThread” only if that does not help. Usually “pause” is enough: the lock is released before 5000 iterations are reached, so no “SwitchToThread” call is needed.

    The following should also be taken into consideration: (1) Each call to SwitchToThread() experiences the expensive cost of a context switch, which can be 10000+ cycles; (2) It also suffers the cost of ring 3 to ring 0 transitions, which can be 1000+ cycles; (3) SwitchToThread() may be of no use if no threads are in the ready state.

    The FastMM4-AVX branch checks whether the CPU supports SSE2, and thus the “pause” instruction. If it does, the branch spins on “pause” for 5000 iterations before calling “SwitchToThread”. If the CPU doesn’t have the “pause” instruction or Windows doesn’t have the SwitchToThread() API function, it uses EnterCriticalSection/LeaveCriticalSection.
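
The spin-first, yield-later policy described above can be sketched as control flow. This is neither FastMM4-AVX nor FastMM5 code, and Python has no “pause” instruction, so the sketch only models the order of the steps: each failed try stands in for one “pause” iteration, and `time.sleep(0)` stands in for “SwitchToThread”. The function name, its parameters and the choice to restart the spin budget after each yield are assumptions made for illustration.

```python
import threading
import time

SPIN_LIMIT = 5000  # iteration count quoted in the comment above


def acquire_spin_then_yield(try_acquire, spin_limit=SPIN_LIMIT, yield_cpu=None):
    """Retry try_acquire() up to spin_limit times before yielding the CPU.

    Returns (spins, yields) so the caller can see which path was taken.
    Assumption of this sketch: the spin budget restarts after every yield.
    """
    if spin_limit < 1:
        raise ValueError("spin_limit must be at least 1")
    if yield_cpu is None:
        yield_cpu = lambda: time.sleep(0)  # stand-in for SwitchToThread
    spins = 0
    yields = 0
    while True:
        for _ in range(spin_limit):
            if try_acquire():
                return spins, yields
            spins += 1  # stand-in for one "pause" iteration
        yield_cpu()     # only reached once the spin budget is used up
        yields += 1


lock = threading.Lock()
result = acquire_spin_then_yield(lambda: lock.acquire(blocking=False))
print(result)  # an uncontended lock is taken on the first try
lock.release()
```

In the common case the lock frees up during the spin phase and the expensive yield is never reached, which is the point the comment makes.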

  2. FastMM5 not free said

    Not sure why you didn’t publish my comment that FastMM is not free for non-commercial use, only for GPL3 code. There are other open source licences that are incompatible with GPL3. And of course there is software that is free (non-commercial) but not open source.

    • jpluimers said

      During my fight with rectal cancer, I only occasionally go through blog comments.

      Usually I leave comments from anonymous commenters unpublished.

  3. None said

    “Free for non-commercial” is misleading, since the licence is GPL3. What you mean is free only for GPL3 code, which is quite restrictive.
