“fixing” ESXi “rsync error: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]”
Posted by jpluimers on 2021/08/30
Reminder to self: create a static ESXi binary for a recent rsync release.
Quite a few people have bumped into rsync erroring out with “large” sets of files (where large can be as low as ~1000), for instance Tj commenting on my post “ESXi 5.1 and rsync – damiendebin.net.”:
ERROR: out of memory in receive_sums [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(102) [sender=3.1.1]
rsync: [generator] write error: Broken pipe (32)
I bumped into this myself as well, even after updating from rsync 3.1.0 to 3.1.2.
There are various static rsync builds for ESXi around. Just a few of them for completeness:
- 3.1.0: ESXi 5.1 and rsync – damiendebin.net. (how to download the build by Damien Debin from http://damiendebin.net and how to create the right XML firewall settings from his [Wayback] gist)
- 3.1.2: [Wayback] DOWNLOADS – bachmann-lan.de via [Wayback] VMware ESXi 5.1 rsync 3.0.9 statically linked binary erstellen – bachmann-lan.de
- 3.1.3: [Wayback/Archive.is] noelmartinon/vmtools: Tools for VMware ESXi to use in ESXi which pointed at the wrong binary link, so I archived the right one in the Wayback machine.
There is also 3.0.9 (via [Wayback] VMware ESXi 5.1 rsync 3.0.9 statically linked binary erstellen – bachmann-lan.de), but it has a VMFS bug ([Wayback] 8177 – Problems with big sparsed files) as per [Wayback] ESXi 5.1 and rsync – damiendebin.net.
The good news is that it is fixed in 3.2.2 as a user-configurable setting, but since there is no ESXi build yet (see reminder above)…
Anyway: [Wayback] 12769 – error allocating core memory buffers (code 22) depending on source file system
Wayne Davison 2020-06-26 03:56:35 UTC
I fixed the allocation args to be size_t values (and improved a bunch of allocation error checking while I was at it). I then added an option that lets you override this allocation sanity-check value. The default is still 1G per allocation, but you can now specify a much larger value (up to "--max-alloc=8192P-1"). If you want to make a larger value the default for your copies, export RSYNC_MAX_ALLOC in the environment with the size value of your choice. Committed for release in 3.2.2.
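Once an rsync 3.2.2 or newer is available on the host, the option from that quote could be applied conditionally. The sketch below is only an illustration: the helper names `supports_max_alloc` and `max_alloc_args` are made up for this post, and the `4G` value is an arbitrary example, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helpers; only --max-alloc and RSYNC_MAX_ALLOC come from the quote above.

# Succeeds when the given version string (e.g. "3.1.2") is 3.2.2 or newer.
supports_max_alloc() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split "MAJOR.MINOR.PATCH" on dots
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}
    patch=${patch%%[!0-9]*}   # drop suffixes such as "pre1"
    patch=${patch:-0}
    [ "$major" -gt 3 ] && return 0
    [ "$major" -lt 3 ] && return 1
    [ "$minor" -gt 2 ] && return 0
    [ "$minor" -lt 2 ] && return 1
    [ "$patch" -ge 2 ]
}

# Prints "--max-alloc=SIZE" for new enough versions, nothing otherwise.
max_alloc_args() {
    if supports_max_alloc "$1"; then
        printf '%s\n' "--max-alloc=$2"
    fi
}

# Possible usage (not run here):
#   version=$(rsync --version | sed -n '1s/^rsync  *version \([0-9.]*\).*/\1/p')
#   rsync -a $(max_alloc_args "$version" 4G) src/ dst/
# or, for every copy in the session:
#   export RSYNC_MAX_ALLOC=4G
```

On the 3.1.x builds listed above the helper prints nothing, so the same command line keeps working there (and keeps failing in the same way).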
This is what happens with 3.1.2 and 3.1.3:
time rsync -aiv --info=progress2 --progress --partial --existing --inplace /vmfs/volumes/Samsung850-2TB-S3D4NX0HA01043L/ Samsung850-2TB-S3D4NX0HA01043L/
sending incremental file list
0 0% 0.00kB/s 0:00:00 (xfr#0, ir-chk=1000/1259)
ERROR: out of memory in flist_expand [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]
Command exited with non-zero status 22
real 0m 0.87s
user 0m 0.10s
sys 0m 0.00s

time rsync -aiv --info=progress2 --progress --partial --ignore-existing --sparse /vmfs/volumes/Samsung850-2TB-S3D4NX0HA01043L/ Samsung850-2TB-S3D4NX0HA01043L/
sending incremental file list
0 0% 0.00kB/s 0:00:00 (xfr#0, ir-chk=1000/1259)
ERROR: out of memory in flist_expand [sender]
rsync error: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]
Command exited with non-zero status 22
real 0m 0.28s
user 0m 0.12s
sys 0m 0.00s
Finished
I was lucky that [Wayback] “rsync error: error allocating core memory buffers” protocol version “3.1.2” – Google Search got me a result so quickly: add a --protocol=29 and you are set.
The first result (Wayback has the results reversed from what I got) didn’t fix it. The second did.
- [Wayback] 225761 – net/rsync long path causes buffer overflow (update to 3.1.3)
- [Wayback/Archive.is] AIX Open Source – IBM Power Systems Community: rsync out of memory
As a work around, I added “--protocol=29” to one of our servers that was consistently failing with “ERROR: out of memory in flist_expand [receiver]” “rsync error: error allocating core memory buffers (code 22) at util2.c(105) [receiver=3.1.3]” in rsync-3.1.3-2.ppc
I read the man page and started experimenting with the protocol version until I lowered it enough to get it to work consistently.
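The workaround can be wrapped so that the lower protocol version is only used when the allocation error actually occurs. This is a minimal sketch, not something taken from either source: `rsync_with_fallback` and the `RSYNC_BIN` variable are hypothetical names, and it assumes exit code 22 is a good enough signal to retry.

```shell
#!/bin/sh
# RSYNC_BIN is a hypothetical override, e.g. to point at a static build on a datastore.
RSYNC_BIN=${RSYNC_BIN:-rsync}

# Runs rsync with the given arguments; on exit code 22
# ("error allocating core memory buffers") retries once with --protocol=29.
rsync_with_fallback() {
    status=0
    "$RSYNC_BIN" "$@" || status=$?
    if [ "$status" -eq 22 ]; then
        echo "rsync exited with code 22, retrying with --protocol=29" >&2
        status=0
        "$RSYNC_BIN" --protocol=29 "$@" || status=$?
    fi
    return "$status"
}

# Possible usage (not run here):
#   rsync_with_fallback -aiv --partial --existing --inplace /vmfs/volumes/source/ target/
```

Retrying costs one failed attempt, but in the failing runs in this post that attempt took less than a second.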
The problem might be that running on ESXi gives you limited memory, but even then some 10k files should not use more than roughly half a megabyte of memory.
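As a back-of-the-envelope check of that estimate: half a megabyte for 10k files works out to roughly 50 bytes per file-list entry. That per-entry size is an assumption implied by the sentence above, not something measured on ESXi, and `estimate_flist_kib` is a hypothetical helper name.

```shell
#!/bin/sh
# Rough file-list memory estimate: files * bytes per entry, printed in KiB.
# The per-entry size is an assumption, not a measured rsync value.
estimate_flist_kib() {
    files=$1
    bytes_per_entry=$2
    echo $(( files * bytes_per_entry / 1024 ))
}

estimate_flist_kib 10000 50   # prints 488
estimate_flist_kib 9059 50    # prints 442 (9059 is the file count rsync reports further down)
```

Even if the real per-entry size were several times larger, the result stays orders of magnitude below the 1G per-allocation default mentioned in the bug report.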
Some time I will dig deeper into the protocol version differences; for now, a list of files I think will be relevant for that (mainly look for protocol_version):
- [Wayback/Archive.is] csprotocol.txt
- [Wayback/Archive.is] main.c
- [Wayback/Archive.is] sender.c
- [Wayback/Archive.is] receiver.c
- [Wayback/Archive.is] compat.c
- [Wayback/Archive.is] clientserver.c
- [Wayback/Archive.is] authenticate.c
- [Wayback/Archive.is] rsync.h
Some web pages that mention the --protocol option and might give me more insight into the protocol differences:
- [Wayback] linux – rsync protocol incompatibility – Server Fault
- [Wayback] rsync(1) man page
- [Wayback] rsync NEWS (with release history)
- [Wayback] linux – Trouble negotiating rsync protocol versions – Super User
- [Wayback] bash – Rsync seems incompatible with .bashrc (causes “is your shell clean?”) – Server Fault
With --protocol=29, time estimation is way off, but there are no errors:
time rsync -aiv --info=progress2 --progress --partial --existing --inplace --protocol=29 /vmfs/volumes/Samsung850-2TB-S3D4NX0HA01043L/ Samsung850-2TB-S3D4NX0HA01043L/
building file list ... 9059 files to consider
.d..t...... isos/
27,593 0% 0.00kB/s 0:00:06 (xfr#1, to-chk=0/9059)
sent 212,594 bytes received 268 bytes 20,272.57 bytes/sec
total size is 3,055,677,645,398 speedup is 14,355,204.99
real 0m 13.31s
user 0m 1.35s
sys 0m 0.00s

time /vmfs/volumes/5791a3e1-0b9368de-4965-0cc47aaa9742/local-bin/rsync -aiv --info=progress2 --progress --partial --ignore-existing --sparse --protocol=29 /vmfs/volumes/Samsung850-2TB-S3D4NX0HA01043L/ Samsung850-2TB-S3D4NX0HA01043L/
building file list ... 9059 files to consider
>f+++++++++ isos/EN-Windows-XP-SP3-VL.iso
...
cd+++++++++ ESXi65.filesystem-root/usr/share/
216,868,164,639 7% 40.64MB/s 1:24:48 (xfr#2571, to-chk=0/9059)
sent 216,894,938,870 bytes received 57,858 bytes 42,582,702.80 bytes/sec
total size is 3,055,677,645,398 speedup is 14.09
real 1h 24m 58s
user 34m 5.59s
sys 0m 0.00s
Finished
Even outside ESXi, just a few people seem to have bumped into this; I wonder why there are so few matches on [Wayback] “ERROR: out of memory in flist_expand [sender]” “sender=3.1” – Google Search:
- RHEL rsync.x86_64 3.1.2-6.el7_6.1 [Wayback] 1751822 – rsync failing with out of memory in flist_expand (Red Hat declined to fix it because “Red Hat Enterprise Linux version 7 entered the Maintenance Support 1 Phase in August 2019”)
- Cygwin rsync 3.0.9:
–jeroen