zfs receive of a zfs send -c (compressed) sendstream causes heap corruption #557

Closed
rottegift opened this issue Mar 4, 2017 · 6 comments

Comments

@rottegift
Contributor

"zfs send -c foo/bar@war | zfs recv" results in arc_free_data_buf() using the wrong size in the call to zio_data_buf_free(data, size).

This will result in a panic when the size difference is large enough that a different zio_data_buf_cache is chosen.
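
For context, zio_data_buf_free() picks the kmem cache purely from the size argument, so a size that is wrong by enough frees the buffer into a neighbouring cache. A rough sketch, paraphrased from zio.c (the exact assertion may differ between ports):

void
zio_data_buf_free(void *buf, size_t size)
{
	/* the cache index comes only from the caller-supplied size */
	size_t c = (size - 1) >> SPA_MINBLOCKSHIFT;

	VERIFY3U(c, <, SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT);

	kmem_cache_free(zio_data_buf_cache[c], buf);
}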

The panic will be:

panic(cpu 0 caller 0xffffff7fa9a84bfc): "kernel heap corruption detected"@spl-kmem.c:999

although the site of the panic will vary depending on KMF debugging options. An example stack is further below.

KMF_AUDIT debugging on the zio_data_buf_cache kmem_caches results in pre-panic logging like this:

Mar  1 03:23:23 cla kernel[0]: SPL: kernel memory allocator: buffer freed to wrong cache
Mar  1 03:23:23 cla kernel[0]: SPL: buffer was allocated from zio_data_buf_86016,
Mar  1 03:23:23 cla kernel[0]: SPL: caller attempting free to zio_data_buf_131072.
Mar  1 03:23:23 cla kernel[0]: SPL: buffer=<ptr>  bufctl=<ptr>  cache: zio_data_buf_131072
Mar  1 03:23:23 cla kernel[0]: SPL: previous transaction on buffer <ptr>:
Mar  1 03:23:23 cla kernel[0]: SPL: thread=<ptr>  time=T-2.038085927  slab=<ptr>  cache: zio_data_buf_86016
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.spl : _kmem_cache_alloc_debug + 0x3f0
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.spl : _kmem_cache_alloc + 0x503
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.spl : _spl_zio_kmem_cache_alloc + 0x25
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _arc_get_data_buf + 0x4db
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _arc_hdr_alloc_pdata + 0x17a
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _arc_hdr_alloc + 0x283
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _arc_alloc_compressed_buf + 0x113
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _arc_loan_compressed_buf + 0x23
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _dmu_recv_stream + 0xb09
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _zfs_ioc_recv + 0xa64
Mar  1 03:23:23 cla kernel[0]: SPL: net.lundman.zfs : _zfsdev_ioctl + 0x41d
Mar  1 03:23:23 cla kernel[0]: SPL: mach_kernel : _spec_ioctl + 0x112
Mar  1 03:23:23 cla kernel[0]: SPL: mach_kernel : _VNOP_IOCTL + 0x16e
Mar  1 03:23:23 cla kernel[0]: SPL: mach_kernel : _vn_pathconf + 0xeaa
Mar  1 03:23:23 cla kernel[0]: SPL: arc.c arc_share_buf 2787 : ASSERTION(arc_can_share(hdr, buf)) failed
Mar  1 03:23:24 cla kernel[0]: SPL: kernel memory allocator: buffer freed to wrong cache
Mar  1 03:23:24 cla kernel[0]: SPL: buffer was allocated from zio_data_buf_61440,
Mar  1 03:23:24 cla kernel[0]: SPL: caller attempting free to zio_data_buf_131072.
Mar  1 03:23:24 cla kernel[0]: SPL: buffer=<ptr>  bufctl=<ptr>  cache: zio_data_buf_131072
Mar  1 03:23:24 cla kernel[0]: SPL: previous transaction on buffer <ptr>:
Mar  1 03:23:24 cla kernel[0]: SPL: thread=<ptr>  time=T-1.819404063  slab=<ptr>  cache: zio_data_buf_61440
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.spl : _kmem_cache_alloc_debug + 0x3f0
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.spl : _kmem_cache_alloc + 0x7b9
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.spl : _spl_zio_kmem_cache_alloc + 0x330
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _arc_get_data_buf + 0x4db
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _arc_hdr_alloc_pdata + 0x17a
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _arc_hdr_alloc + 0x283
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _arc_alloc_compressed_buf + 0x113
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _arc_loan_compressed_buf + 0x23
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _dmu_recv_stream + 0xb09
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _zfs_ioc_recv + 0xa64
Mar  1 03:23:24 cla kernel[0]: SPL: net.lundman.zfs : _zfsdev_ioctl + 0x41d
Mar  1 03:23:24 cla kernel[0]: SPL: mach_kernel : _spec_ioctl + 0x112
Mar  1 03:23:24 cla kernel[0]: SPL: mach_kernel : _VNOP_IOCTL + 0x16e
Mar  1 03:23:24 cla kernel[0]: SPL: mach_kernel : _vn_pathconf + 0xeaa
Mar  1 03:23:24 cla kernel[0]: SPL: mach_kernel : _fo_ioctl + 0x7b

DEBUG panic stack with compile-time inlining reduced:

Wed Mar  1 03:59:23 2017

*** Panic Report ***
panic(cpu 0 caller 0xffffff7fa9a84bfc): "kernel heap corruption detected"@spl-kmem.c:999
Backtrace (CPU 0), Frame : Return Address
0xffffff92da4ef1d0 : 0xffffff80292d7b92 mach_kernel : _panic + 0xe2
0xffffff92da4ef250 : 0xffffff7fa9a84bfc net.lundman.spl : _kmem_error + 0x6bc
0xffffff92da4ef350 : 0xffffff7fa9a83e24 net.lundman.spl : _kmem_cache_free_debug + 0xd4
0xffffff92da4ef3b0 : 0xffffff7fa9a83c39 net.lundman.spl : _kmem_cache_free + 0x269
0xffffff92da4ef410 : 0xffffff7faae1d5af net.lundman.zfs : _zio_data_buf_free + 0xaf
0xffffff92da4ef470 : 0xffffff7faaca193c net.lundman.zfs : _arc_free_data_buf + 0x26c
0xffffff92da4ef510 : 0xffffff7faac9bd55 net.lundman.zfs : _arc_hdr_free_pdata + 0x165
0xffffff92da4ef580 : 0xffffff7faac926d4 net.lundman.zfs : _arc_hdr_destroy + 0x364
0xffffff92da4ef610 : 0xffffff7faac91dd3 net.lundman.zfs : _arc_buf_destroy + 0x1f3
0xffffff92da4ef700 : 0xffffff7faacb1e02 net.lundman.zfs : _dbuf_destroy + 0xf2
0xffffff92da4ef760 : 0xffffff7faacb9ae2 net.lundman.zfs : _dbuf_rele_and_unlock + 0x302
0xffffff92da4ef850 : 0xffffff7faacbeb44 net.lundman.zfs : _dbuf_write_done + 0xdf4
0xffffff92da4ef9d0 : 0xffffff7faac9dfc8 net.lundman.zfs : _arc_write_done + 0xc48
0xffffff92da4efb00 : 0xffffff7faae3033f net.lundman.zfs : _zio_done + 0x243f
0xffffff92da4efe50 : 0xffffff7faae23111 net.lundman.zfs : ___zio_execute + 0x491
0xffffff92da4eff10 : 0xffffff7fa9a9a661 net.lundman.spl : _taskq_thread + 0x421
0xffffff92da4effb0 : 0xffffff80293c6537 mach_kernel : _call_continuation + 0x17
@rottegift
Contributor Author

@dankimmel - it's a bit of a stretch, but since you're looking into the guts of the arc (and abd) memory flow: we found a heap corruption issue involving zfs recv of compressed streams that has been bothering our port.

The tl;dr version is that prior to abd landing I dramatically increased the number of zio_buf_%lu and zio_data_buf_%lu arenas to further segregate buffers by size. That helped with the underlying problem that our lowest vmem layer must be a client of xnu's finicky kernel allocator, which does not like to exchange memory often or in small chunks.
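
For reference, a simplified sketch of the loop in zio_init() that creates those per-size arenas (upstream only creates caches at some of these sizes and points the gaps at the next-larger one; the O3X change creates caches for many more of the sizes; alignment and cache flags are elided here):

	size_t c;

	for (c = 0; c < SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT; c++) {
		size_t size = (c + 1) << SPA_MINBLOCKSHIFT;
		char name[36];

		/* one cache per size step, named by its buffer size */
		(void) snprintf(name, sizeof (name), "zio_buf_%lu",
		    (ulong_t)size);
		zio_buf_cache[c] = kmem_cache_create(name, size, 0,
		    NULL, NULL, NULL, NULL, NULL, 0);

		(void) snprintf(name, sizeof (name), "zio_data_buf_%lu",
		    (ulong_t)size);
		zio_data_buf_cache[c] = kmem_cache_create(name, size, 0,
		    NULL, NULL, NULL, NULL, NULL, 0);
	}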

Since compressed arc landed, receiving a compressed stream can result in a zio_data_buf_free() with the wrong (and, I think, always too-large) size.

I bet this affects the other ports too, but is masked by the happy coincidence of the (likely always smaller) bad size still fitting into the original kmem_cache.
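
To make the shape of the mismatch concrete (a sketch only, not the actual call sites, using names that appear in arc.c): the b_data for a received compressed stream is sized to the physical size, while arc_buf_size() chooses between physical and logical size based on a flag on the buf, so if that bookkeeping drifts between alloc and free the two sizes disagree:

uint64_t
arc_buf_size(arc_buf_t *buf)
{
	/* psize for a compressed buf, lsize otherwise */
	return (ARC_BUF_COMPRESSED(buf) ?
	    HDR_GET_PSIZE(buf->b_hdr) : HDR_GET_LSIZE(buf->b_hdr));
}

/* recv side: b_data for the compressed buf is allocated at psize ... */
buf->b_data = arc_get_data_buf(hdr, HDR_GET_PSIZE(hdr), buf);

/* ... but a free that no longer sees the buf as compressed passes lsize */
arc_free_data_buf(hdr, buf->b_data, arc_buf_size(buf), buf);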

If you have any ideas, that'd be very helpful, and we'd happily return to trying to find and fix the true source of this panic.

Stacks of the panic (descending from zio_execute) and of the allocator (descending from dmu_recv_stream) are in the following issue.

#557

(PS: kmem debugging is awesomely useful).

@dankimmel
Contributor

@rottegift Interesting, we haven't seen that problem before, and that's something that the kmem debugging facilities check for us on debug builds (so I think we'd be pretty likely to notice it). Are you using any other arguments to zfs send besides compression? (deduped stream, embedded blocks, large blocks, etc?)

@rjhind

rjhind commented Jul 10, 2017

I received a panic on O3X 1.6.1 master (but basically the 1.6.2 tag) using zfs send -ce | zfs receive from a single-disk pool on an internal SATA SSD to an external pool on a USB3 disk.

I wonder if this is the same issue?

zfs_comp_send_panic.txt

@lundman
Contributor

lundman commented Jul 10, 2017

@dankimmel We ported IllumOS's kmem over, so we enjoy the same debugging facilities, and they are excellent. However, this seems to be a problem just for us. I added the hack:


diff --git a/include/sys/arc.h b/include/sys/arc.h
index e996bd8..5eba254 100644
--- a/include/sys/arc.h
+++ b/include/sys/arc.h
@@ -131,6 +131,7 @@ struct arc_buf {
 	kmutex_t		b_evict_lock;
 	void			*b_data;
 	arc_buf_flags_t		b_flags;
+	uint64_t b_the_real_size;
 };

 typedef enum arc_buf_contents {
diff --git a/load.sh b/load.sh
index 2c55efb..823dd3e 100755
--- a/load.sh
+++ b/load.sh
@@ -1,11 +1,5 @@
 #!/bin/bash

-pids=`ps -axww|grep "[t]ail -f /var/log/system.log" | awk '{print $1;}'`
-if [ x"$pids" != x"" ]; then
-    kill $pids
-fi
-tail -f /var/log/system.log &
-
 rsync -ar module/zfs/zfs.kext/ /tmp/zfs.kext/
 rsync -ar ../spl/module/spl/spl.kext/ /tmp/spl.kext/

diff --git a/module/zfs/arc.c b/module/zfs/arc.c
index 136ef34..70e10a1 100644
--- a/module/zfs/arc.c
+++ b/module/zfs/arc.c
@@ -2007,6 +2007,7 @@ arc_buf_fill(arc_buf_t *buf, boolean_t compressed)

 			/* We need to give the buf it's own b_data */
 			buf->b_flags &= ~ARC_BUF_FLAG_SHARED;
+			buf->b_the_real_size = HDR_GET_LSIZE(hdr);
 			buf->b_data =
 			    arc_get_data_buf(hdr, HDR_GET_LSIZE(hdr), buf);
 			arc_hdr_clear_flags(hdr, ARC_FLAG_SHARED_DATA);
@@ -2015,8 +2016,10 @@ arc_buf_fill(arc_buf_t *buf, boolean_t compressed)
 			ARCSTAT_INCR(arcstat_overhead_size, HDR_GET_LSIZE(hdr));
 		} else if (ARC_BUF_COMPRESSED(buf)) {
 			/* We need to reallocate the buf's b_data */
+			VERIFY(buf->b_the_real_size == HDR_GET_PSIZE(hdr));
 			arc_free_data_buf(hdr, buf->b_data, HDR_GET_PSIZE(hdr),
 			    buf);
+			buf->b_the_real_size = HDR_GET_LSIZE(hdr);
 			buf->b_data =
 			    arc_get_data_buf(hdr, HDR_GET_LSIZE(hdr), buf);

@@ -2588,6 +2591,7 @@ arc_buf_alloc_impl(arc_buf_hdr_t *hdr, void *tag, boolean_t compressed,
 		buf->b_flags |= ARC_BUF_FLAG_SHARED;
 		arc_hdr_set_flags(hdr, ARC_FLAG_SHARED_DATA);
 	} else {
+		buf->b_the_real_size = arc_buf_size(buf);
 		buf->b_data =
 		    arc_get_data_buf(hdr, arc_buf_size(buf), buf);
 		ARCSTAT_INCR(arcstat_overhead_size, arc_buf_size(buf));
@@ -2854,6 +2858,8 @@ arc_buf_destroy_impl(arc_buf_t *buf)
 			arc_hdr_clear_flags(hdr, ARC_FLAG_SHARED_DATA);
 		} else {
 			uint64_t size = arc_buf_size(buf);
+			VERIFY(buf->b_the_real_size == size);
+			//size = buf->b_the_real_size;
 			arc_free_data_buf(hdr, buf->b_data, size, buf);
 			ARCSTAT_INCR(arcstat_overhead_size, -size);
 		}

and it panics quite easily when receiving a zfs send -c snapshot. In particular:

kernel`panic(str="\"VERIFY(\" \"buf->b_the_real_size == size\" \") failed\\n\"@arc.c:2861") at debug.c:459 [opt]

where it was allocated at size 13824 but freed at size 10240. You can see my commented-out line which makes it work by assigning size from b_the_real_size. (Obviously not a real fix.)

But the same hack on IllumOS does not trigger, so this will be an in-house problem :) We'll have to diff arc.c again, and see if any of the various size variables differ somewhere.

lundman added a commit that referenced this issue Jul 11, 2017
Issue #557

Diffed and corrected changes to upstream, found the issue with receiving
compressed snapshots, in zio.c. The rest are just to be identical to
upstream.
@lundman
Contributor

lundman commented Jul 11, 2017

  • Please confirm
  • Close issue.

:)

@rjhind

rjhind commented Jul 11, 2017

I've just run the same test again (30g snapshot) using -ce and it sent and received fine with the fix. Thanks.

@lundman closed this as completed Jul 12, 2017