Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenZFS 9403 - assertion failed in arc_buf_destroy() when concurrentl… #7822

Merged
merged 1 commit into from
Aug 29, 2018

Conversation

tcaputi
Copy link
Contributor

@tcaputi tcaputi commented Aug 21, 2018

…y reading block with checksum error

Porting notes:

  • zio_decompress_fail_fraction was made into a tunable. In the
    future this could be added to zinject.

Authored by: Matt Ahrens mahrens@delphix.com
Reviewed by: George Wilson george.wilson@delphix.com
Reviewed by: Paul Dagnelie pcd@delphix.com
Reviewed by: Pavel Zakharov pavel.zakharov@delphix.com
Approved by: Matt Ahrens mahrens@delphix.com
Ported-by: Tom Caputi tcaputi@datto.com

OpenZFS-issue: https://illumos.org/issues/9403
OpenZFS-commit: openzfs/openzfs@fa98e487a9

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@codecov
Copy link

codecov bot commented Aug 22, 2018

Codecov Report

Merging #7822 into master will increase coverage by 0.08%.
The diff coverage is 72.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7822      +/-   ##
==========================================
+ Coverage   78.33%   78.42%   +0.08%     
==========================================
  Files         374      373       -1     
  Lines      112924   112802     -122     
==========================================
+ Hits        88459    88463       +4     
+ Misses      24465    24339     -126
Flag Coverage Δ
#kernel 78.43% <71.42%> (-0.36%) ⬇️
#user 67.5% <68.96%> (+0.35%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 47ab01a...822739a. Read the comment docs.

@@ -5803,10 +5803,11 @@ arc_getbuf_func(zio_t *zio, const zbookmark_phys_t *zb, const blkptr_t *bp,
arc_buf_t **bufp = arg;

if (buf == NULL) {
ASSERT(zio == NULL || zio->io_error != 0);
*bufp = NULL;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you accidentally dropped an ASSERT from the OpenZFS version which should be here.

ASSERT(zio == NULL || zio->io_error == 0);

There are also two additional ASSERTs which were lost and should be added to dbuf_prefetch_indirect_done().

ASSERT0(ret);
if (zio_decompress_fail_fraction != 0 &&
spa_get_random(zio_decompress_fail_fraction) == 0)
ret = SET_ERROR(EINVAL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After taking a second look at this I think it's worth doing this decompress error injection with zinject -e decompress. You can model it after the existing zinject -e decrypt support and the zio_handle_decrypt_injection handler used in zio_decrypt.

This makes the code consistent and allows for an easy test case by adding a new fault/decompress_fault.ksh based on the existing fault/decrypt_fault.ksh.

@behlendorf behlendorf added the Status: Work in Progress Not yet ready for general review label Aug 22, 2018
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few remaining nits.

module/zfs/zio.c Outdated
@@ -389,6 +389,12 @@ zio_decompress(zio_t *zio, abd_t *data, uint64_t size)
zio->io_abd, tmp, zio->io_size, size);
abd_return_buf_copy(data, tmp, size);

#ifdef _KERNEL
printk(KERN_DEBUG "HERE1: %d %d\n", zio_injection_enabled, ret);
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray debug line needs to be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that.

log_onexit cleanup

default_mirror_setup_noexit $DISK1 $DISK2
log_must eval "echo 0 > /sys/module/zfs/parameters/zfs_compressed_arc_enabled"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use log_must set_tunable64 zfs_compressed_arc_enabled 0 for this and the cleanup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for letting me know that that existed. I will make sure to use it in the future.

@behlendorf behlendorf removed the Status: Work in Progress Not yet ready for general review label Aug 26, 2018
…y reading block with checksum error

Porting notes:
* The ability to zinject decompression errors has been added, but
  this only works at the zio_decompress() level, where we have all
  of the inf we need to match against the user's zinject options.
* The decompress_fault test has been added to test the new zinject
  functionality
* We attempted to set zio_decompress_fail_fraction to (1 << 18) in
  ztest for further test coverage. Although this did uncover a few
  low priority issues, this unfortuantely also causes ztest to
  ASSERT in many locations where the code is working correctly since
  it is designed to fail on IO errors. Developers can manually set
  this variable with the '-o' option to find and debug issues.

Authored by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Ported-by: Tom Caputi <tcaputi@datto.com>

OpenZFS-issue: https://illumos.org/issues/9403
OpenZFS-commit: openzfs/openzfs@fa98e487a9
@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Aug 29, 2018
@behlendorf behlendorf merged commit c3bd3fb into openzfs:master Aug 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants