zfs receive of a zfs send -c (compressed) sendstream causes heap corruption #557
@dankimmel - although it's a bit of a stretch, since you're looking into the guts of arc (and abd) memory flow, we found a heap corruption issue involving zfs recv on compressed streams which has been bothering our port. The tl;dr version is that prior to abd landing I dramatically increased the number of zio_buf_%lu and zio_data_buf_%lu arenas to further segregate buffers by size, which helped with the underlying problem that our lowest vmem layer must be a client of xnu's finicky kernel allocator, and the latter does not like to exchange memory often or in small chunks. Since compressed arc landed, receiving a compressed stream can result in a zio_data_buf_free() with the wrong (and I think always too-large) size. I bet this affects the other ports too, but is masked by the happy coincidence of the (likely always smaller) bad size still fitting into the original kmem_cache. If you have any ideas, that'd be very helpful, and we'd happily return to trying to find and fix the true source of this panic. The stack of the panic (descending from zio_execute) and of the allocator (descending from dmu_recv_stream) is in the following issue. (PS: kmem debugging is awesomely useful.)
@rottegift Interesting, we haven't seen that problem before, and that's something that the
I received a panic on O3X 1.6.1 master (basically the 1.6.2 tag) using zfs send -ce | zfs receive from a single-disk pool on an internal SATA SSD to an external pool on a USB3 disk. I wonder if this is the same issue?
@dankimmel We ported IllumOS's the hack
.
where it was allocated at size But the same hack on IllumOS does not trigger, so this will be an in-house problem :) We'll have to diff
Issue #557 Diffed and corrected changes to upstream, found the issue with receiving compressed snapshots, in zio.c. The rest are just to be identical to upstream.
:)
I've just run the same test again (30 GB snapshot) using -ce, and it sent and received fine with the fix. Thanks.
"zfs send -c foo/bar@war | zfs recv" results in arc_free_data_buf() using the wrong size in the call to zio_data_buf_free(data, size).
This will result in a panic when the size difference is large enough that a different zio_data_buf_cache is chosen.
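To make the "different zio_data_buf_cache is chosen" condition concrete, here is a minimal sketch of how a size can map to a per-size cache index. It assumes 512-byte size classes (a shift of 9, matching SPA_MINBLOCKSHIFT); the helper names `cache_index` and `free_hits_wrong_cache` are hypothetical and are not taken from zio.c.

```c
#include <stddef.h>

/* Assumption: size classes step in 512-byte units (shift of 9). */
#define MINBLOCKSHIFT 9

/*
 * Hypothetical helper: index of the size-class cache that serves a
 * buffer of `size` bytes. Sizes 1..512 map to 0, 513..1024 to 1, etc.
 */
static size_t
cache_index(size_t size)
{
	return ((size - 1) >> MINBLOCKSHIFT);
}

/*
 * Hypothetical helper: nonzero when freeing with `free_size` would
 * select a different cache than the one `alloc_size` came from.
 */
static int
free_hits_wrong_cache(size_t alloc_size, size_t free_size)
{
	return (cache_index(alloc_size) != cache_index(free_size));
}
```

With these assumptions, a buffer allocated at 4096 bytes and freed at 131072 bytes lands in a different index (7 versus 255), while a small discrepancy such as 4096 versus 3900 stays in index 7. If several indexes share one underlying kmem_cache, as suggested in the comment about other ports, even more mismatches would go unnoticed.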
The panic will be:
although the site of the panic will vary depending on KMF debugging options. An example stack is further below.
KMF_AUDIT debugging on the zio_data_buf_cache kmem_caches results in pre-panic logging like this:
DEBUG panic stack with compile-time inlining reduced: