-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenZFS 9403 - assertion failed in arc_buf_destroy() when concurrentl… #7822
Conversation
Codecov Report
@@ Coverage Diff @@
## master #7822 +/- ##
==========================================
+ Coverage 78.33% 78.42% +0.08%
==========================================
Files 374 373 -1
Lines 112924 112802 -122
==========================================
+ Hits 88459 88463 +4
+ Misses 24465 24339 -126
Continue to review full report at Codecov.
|
@@ -5803,10 +5803,11 @@ arc_getbuf_func(zio_t *zio, const zbookmark_phys_t *zb, const blkptr_t *bp, | |||
arc_buf_t **bufp = arg; | |||
|
|||
if (buf == NULL) { | |||
ASSERT(zio == NULL || zio->io_error != 0); | |||
*bufp = NULL; | |||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you accidentally dropped an ASSERT from the OpenZFS version which should be here.
ASSERT(zio == NULL || zio->io_error == 0);
There are also two additional ASSERTs which were lost and should be added to dbuf_prefetch_indirect_done()
.
ASSERT0(ret); | ||
if (zio_decompress_fail_fraction != 0 && | ||
spa_get_random(zio_decompress_fail_fraction) == 0) | ||
ret = SET_ERROR(EINVAL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After taking a second look at this I think it's worth doing this decompress error injection with zinject -e decompress
. You can model it after the existing zinject -e decrypt
support and the zio_handle_decrypt_injection
handler used in zio_decrypt
.
This makes the code consistent and allows for an easy test case by adding a new fault/decompress_fault.ksh
based on the existing fault/decrypt_fault.ksh
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few remaining nits.
module/zfs/zio.c
Outdated
@@ -389,6 +389,12 @@ zio_decompress(zio_t *zio, abd_t *data, uint64_t size) | |||
zio->io_abd, tmp, zio->io_size, size); | |||
abd_return_buf_copy(data, tmp, size); | |||
|
|||
#ifdef _KERNEL | |||
printk(KERN_DEBUG "HERE1: %d %d\n", zio_injection_enabled, ret); | |||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stray debug line needs to be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that.
log_onexit cleanup | ||
|
||
default_mirror_setup_noexit $DISK1 $DISK2 | ||
log_must eval "echo 0 > /sys/module/zfs/parameters/zfs_compressed_arc_enabled" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use log_must set_tunable64 zfs_compressed_arc_enabled 0
for this and the cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for letting me know that that existed. I will make sure to use it in the future.
…y reading block with checksum error Porting notes: * The ability to zinject decompression errors has been added, but this only works at the zio_decompress() level, where we have all of the inf we need to match against the user's zinject options. * The decompress_fault test has been added to test the new zinject functionality * We attempted to set zio_decompress_fail_fraction to (1 << 18) in ztest for further test coverage. Although this did uncover a few low priority issues, this unfortuantely also causes ztest to ASSERT in many locations where the code is working correctly since it is designed to fail on IO errors. Developers can manually set this variable with the '-o' option to find and debug issues. Authored by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Matt Ahrens <mahrens@delphix.com> Ported-by: Tom Caputi <tcaputi@datto.com> OpenZFS-issue: https://illumos.org/issues/9403 OpenZFS-commit: openzfs/openzfs@fa98e487a9
…y reading block with checksum error
Porting notes:
future this could be added to zinject.
Authored by: Matt Ahrens mahrens@delphix.com
Reviewed by: George Wilson george.wilson@delphix.com
Reviewed by: Paul Dagnelie pcd@delphix.com
Reviewed by: Pavel Zakharov pavel.zakharov@delphix.com
Approved by: Matt Ahrens mahrens@delphix.com
Ported-by: Tom Caputi tcaputi@datto.com
OpenZFS-issue: https://illumos.org/issues/9403
OpenZFS-commit: openzfs/openzfs@fa98e487a9
Types of changes
Checklist:
Signed-off-by
.