Non-deterministic send-stream produced if Embedded Blocks feature is enabled (Proxmox, Ubuntu, FreeBSD, but not Arch) #13778
Comments
Reproduced on FreeBSD too now:
This one has psize=29, and you can see that the non-zero garbage bytes at the end of the block there are exactly the three padding bytes that round 29 up to the next multiple of 8 (32).
We allocate the buffer for the embedded block on the stack, and we don't zero it before use. The unpacking code also doesn't zero anything after the tail. We send unused bytes because we round the payload size up to a multiple of 8. This should be easy to fix; we can just zero the padding bytes after the call to
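The rounding and the proposed zeroing can be illustrated with a minimal C sketch. This is not the actual OpenZFS patch: `round_up8` and `zero_padding` are hypothetical names introduced here, and the buffer is assumed to be a plain byte array that is at least as large as the rounded-up payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a payload size up to the next multiple of 8 bytes. */
static size_t
round_up8(size_t n)
{
	return ((n + 7) & ~(size_t)7);
}

/*
 * Hypothetical helper (not an OpenZFS function): zero the padding
 * bytes between the end of the payload (psize) and the rounded-up
 * length, so that a stack-allocated buffer never exposes stale bytes.
 * Returns the number of bytes that would be sent.
 */
static size_t
zero_padding(uint8_t *buf, size_t bufsize, size_t psize)
{
	size_t sendsize = round_up8(psize);

	/* The caller must supply a buffer that holds the padded payload. */
	assert(sendsize <= bufsize);
	memset(buf + psize, 0, sendsize - psize);
	return (sendsize);
}
```

With psize=29, `round_up8` yields 32, so `zero_padding` clears exactly the three trailing bytes (offsets 29 to 31) and leaves the payload untouched.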
This fixes a kernel stack leak. Closes openzfs#13778 Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
@thenickdude If you have a chance, would you verify that #14255 fixes this?
After installing your commit on Ubuntu 22.04, no changes in the send-stream were detected within 10,000 trials using the repro script I provided earlier (previously each

I also tested it by filling a pool with the ZFS source code like so, then sending and receiving that snapshot and manually diffing the two to verify that the send-stream was not blatantly corrupted by the patch:

truncate -s 2G /tmp/zpooltest
zpool create test /tmp/zpooltest
git clone https://github.com/openzfs/zfs /test/zfs
zfs snapshot test@here
zfs send -e test@here | zfs recv test/recv
diff -r --no-dereference /test/zfs /test/recv/zfs

No differences in the sent and received datasets were detected.

I verified that this testcase did in fact exercise the problematic

$ zfs send -e test@here | zstreamdump -d | grep WRITE_EMBEDDED
...
Total DRR_WRITE_EMBEDDED records = 284 (20024 bytes)

In the original unpatched ZFS version, every
Awesome. Would it be alright for me to add you to the commit message in a Tested-by line?
Yep, sure.
This fixes a kernel stack leak. Closes openzfs#13778 Tested-by: Nicholas Sherlock <n.sherlock@gmail.com> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
This fixes a kernel stack leak. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Tested-by: Nicholas Sherlock <n.sherlock@gmail.com> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes openzfs#13778 Closes openzfs#14255
System information
I reproduced the bug on these systems:
The bug doesn't seem to impact this system:
If I zfs send the same snapshot multiple times with the embedded blocks feature enabled (e.g. using --raw or -e), the checksum for the stream randomly varies, because bytes in the stream randomly vary. This doesn't occur when this flag is disabled.
Running this script reproduces the problem:
On Proxmox my result is:
On Ubuntu 20.04 the problem occurs more rarely (in about 1 in 64 trials), so I had to re-run the send part of the test script several times to catch one. Here is one such run:
On Ubuntu 22.04 I get this result:
You can see that when -e or --raw is used (which also enables -e), the checksum of the produced send stream randomly varies, but when this option is absent the stream is always consistent.
If I pipe the differing streams to "zstreamdump -d", I find that the 5th-last byte of a WRITE_EMBEDDED object randomly varies between streams.
Stream1:
Stream2:
It seems to me that the padding at the end of this block contains random uninitialised memory, which causes WRITE_EMBEDDED objects to vary from send to send.
No other bytes in the stream change from run to run (ignoring the stream checksums, which differ accordingly).
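A byte-level comparison of this kind can be sketched in C as follows. This is only an illustration of how differing offsets between two captured streams might be located; `diff_offsets` is a hypothetical helper, and it assumes both streams have already been read into equally sized buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare two equally sized buffers and record up
 * to max_out offsets at which they differ. Returns the total number of
 * differing bytes, which may exceed max_out.
 */
static size_t
diff_offsets(const uint8_t *a, const uint8_t *b, size_t len,
    size_t *out, size_t max_out)
{
	size_t count = 0;

	assert(a != NULL && b != NULL && out != NULL);
	for (size_t i = 0; i < len; i++) {
		if (a[i] != b[i]) {
			/* Store the offset only while there is room. */
			if (count < max_out)
				out[count] = i;
			count++;
		}
	}
	return (count);
}
```

For two 32-byte records that differ only in their 5th-last byte, the function reports a single difference at offset 27.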
However, if I run this test on Arch Linux, the trailing bytes of the WRITE_EMBEDDED objects are all zeros, and the stream checksum never varied in 1000 trials:
So it seems like the end of the block is being properly zeroed on Arch Linux, but not on Proxmox (Debian with Ubuntu-based kernel) or Ubuntu.