Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dedup=verify doesn't clear the blkptr's dedup flag #8936

Merged
merged 1 commit into from
Jun 21, 2019

Conversation

ahrens
Copy link
Member

@ahrens ahrens commented Jun 19, 2019

Motivation and Context

Description

The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

External-issue: DLPX-48097
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jun 19, 2019
@codecov
Copy link

codecov bot commented Jun 20, 2019

Codecov Report

Merging #8936 into master will increase coverage by 12.5%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8936      +/-   ##
==========================================
+ Coverage   66.26%   78.77%   +12.5%     
==========================================
  Files         324      382      +58     
  Lines      103014   117842   +14828     
==========================================
+ Hits        68261    92827   +24566     
+ Misses      34753    25015    -9738
Flag Coverage Δ
#kernel 79.28% <0%> (?)
#user 67.62% <0%> (+1.36%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb0be12...00a7c2c. Read the comment docs.

@behlendorf behlendorf requested a review from tcaputi June 20, 2019 19:51
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like it may be the long standing root cause which prompted this non-ideal change 5dc6af0. If that's the case, it's worth considering reverting that commit. Since this is such as unlikely code path to hit it'd be nice to somehow add a test case which exercises it.

Copy link
Contributor

@tcaputi tcaputi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although the change is correct, i very much doubt that it has ever caused an issue in the real world. As the comment above states, if we hit this code we should be automatically emailing Nature.

@ahrens
Copy link
Member Author

ahrens commented Jun 20, 2019

Yeah, we actually hit this code path due to a different bug, that caused us to think the two blocks were different, when they weren't really. That bug was caused by abd_cmp() comparing the entire ABD when we only wanted to compare the first part of it. In the current code, I couldn't figure out how that would happen, and presumably if it was happening then we'd be hitting this bug. So I didn't include that code change, although I could - it's basically to pass zio->io_size as an additional parameter to abd_cmp() (as is done on illumos).

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional explanation, I'm happy to go with the PR as-is.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 20, 2019
@behlendorf behlendorf merged commit accd6d9 into openzfs:master Jun 21, 2019
@ahrens ahrens deleted the DLPX-48097 branch June 21, 2019 03:27
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 13, 2019
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes openzfs#8936
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 22, 2019
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes openzfs#8936
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 23, 2019
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes openzfs#8936
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 17, 2019
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes openzfs#8936
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 18, 2019
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes openzfs#8936
tonyhutter pushed a commit that referenced this pull request Sep 26, 2019
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes #8936
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants