Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix dsl_scan_ds_clone_swapped logic #9163

Merged
merged 2 commits into from
Sep 18, 2019

Conversation

avg-I
Copy link
Contributor

@avg-I avg-I commented Aug 14, 2019

The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued. A duplicate insertion would be attempted and
its failure would result in a panic.

This change should fix #9140.

Signed-off-by: Andriy Gapon avg@FreeBSD.org

@avg-I
Copy link
Contributor Author

avg-I commented Aug 14, 2019

Note that I was not sure if mintxg1 and mintxg2 are required to be equal (as they are for the ZAP).
If that's the case, then the code could be made simpler.

@avg-I
Copy link
Contributor Author

avg-I commented Aug 14, 2019

CC @tcaputi

@tcaputi
Copy link
Contributor

tcaputi commented Aug 14, 2019

The first part of the fix (dealing with scan_ds_queue_contains()) looks good to me. For the second part, don't we need similar handling for when both the clone and origin are in the zap?

@avg-I
Copy link
Contributor Author

avg-I commented Aug 14, 2019

I think that that is completly handled in the first branch. It handles two sub-cases: (a) only ds1 is in the ZAP; (b) both ds1 and ds2 are in the ZAP. So, the second branch needs to deal only with the case where only ds2 is in the ZAP.

@tcaputi
Copy link
Contributor

tcaputi commented Aug 14, 2019

Ah, I see. I think it would be good to break that up similarly to how you did with scan_ds_queue_contains(). This way its easy to see that these code paths are doing very similar things.

@avg-I
Copy link
Contributor Author

avg-I commented Aug 14, 2019

Okay, I will do that!
Just want to clarify first if mintxg-s can ever be different between the origin and the clone or if they must always be the same.
Thanks!

@behlendorf behlendorf requested a review from tcaputi August 15, 2019 23:28
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Aug 15, 2019
@tcaputi
Copy link
Contributor

tcaputi commented Aug 16, 2019

I'm not sure about the mintxgs. I would assume they can be different.

@avg-I
Copy link
Contributor Author

avg-I commented Aug 17, 2019

It's just that the code dealing with the ZAP asserts that ds1's mintxg is equal to ds1's and ds2's ds_prev_snap_txg (and ds_prev_snap_txg -s must obviously be equal to each other).
And also the same is asserted for ds2's mintxg.

So, I guessed that mintxg-s recorded in the in-memory queue must obey the same rule.
Maybe @ahrens can help us here.

@ahrens
Copy link
Member

ahrens commented Aug 21, 2019

@avg-I That's right, since we are doing a clone-swap, ds1 and ds2 have the same prev snap, and their mintxgs in the on-disk queue (scn_queue_obj) should be the same as that prev snap's birth. I'm not as familiar with the in-memory queue.

@avg-I avg-I force-pushed the 9140-dsl_scan_ds_clone_swapped branch from a0978af to 2f66a09 Compare August 26, 2019 13:11
module/zfs/dsl_scan.c Show resolved Hide resolved
module/zfs/dsl_scan.c Show resolved Hide resolved
@@ -2184,52 +2184,57 @@ dsl_scan_ds_clone_swapped(dsl_dataset_t *ds1, dsl_dataset_t *ds2, dmu_tx_t *tx)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could use some comments explaining that we need to apply this transformation to both the in-memory and on-disk queues, and since the state of the in-memory and on-disk queues might be different, we need to apply the transformation independently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But I would prefer to let someone who is more familiar with the code do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrens, I am just not entirely sure if the in-memory and on-disk queues can indeed be different in this respect.
If they can be different, then I need to understand why / how.
I need some help with writing the requested comment.
Maybe @tcaputi could provide it.

ASSERT3U(mintxg, ==, dsl_dataset_phys(ds2)->ds_prev_snap_txg);
VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset,
if (ds1_queued && ds2_queued) {
/* Nothing to do if both are queued (or both are not queued). */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use the same algorithm for the on-disk and in-memory queues. I.e. make both of them:

if (ds1_queued) {
  remove ds1, add ds2
}
if (ds2_queued) {
  remove ds2, add ds1
}

or make both of them

if (ds1_queued && ds2_queued) {
  // nothing
} else if (ds1_queued) {
  remove ds1, add ds2
} else if (ds2_queued) {
  remove ds2, add ds1
}

To me, the first way is easier to understand, but either works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I* structured them differently is that I was not sure if mintxg values obey the same rules for both queues. I will change them both to the second form (because of the issue with the first form).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. It would be nice to augment that comment explaining why the "both queued" case is needed (so that we don't try to add something that's already there).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrens, I am not sure how to fit the requested comment into the new control flow.
I mean that removing "both queued" case would not only result in the duplicate insertion, but it would also result in obviously incorrect logic where we swap just one ds but not the other (because of else-if).
In other words, I am not sure how to word a comment that would explain why the code does not do an obviously wrong thing.

@ahrens ahrens added the Type: Defect Incorrect behavior (e.g. crash, hang) label Aug 26, 2019
@avg-I avg-I force-pushed the 9140-dsl_scan_ds_clone_swapped branch from 2f66a09 to 5bd8f78 Compare August 29, 2019 09:52
@codecov
Copy link

codecov bot commented Aug 30, 2019

Codecov Report

Merging #9163 into master will decrease coverage by 0.18%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9163      +/-   ##
==========================================
- Coverage   79.22%   79.04%   -0.19%     
==========================================
  Files         400      400              
  Lines      122041   122054      +13     
==========================================
- Hits        96687    96472     -215     
- Misses      25354    25582     +228
Flag Coverage Δ
#kernel 79.72% <0%> (-0.1%) ⬇️
#user 66.66% <0%> (-0.27%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a49dbbb...7b8781a. Read the comment docs.

The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

This change should fix openzfs#9140.

Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
@avg-I avg-I force-pushed the 9140-dsl_scan_ds_clone_swapped branch from 5bd8f78 to 1501c95 Compare September 4, 2019 07:34
@behlendorf behlendorf requested a review from ahrens September 11, 2019 00:10
@behlendorf
Copy link
Contributor

@tcaputi would you mind reviewing this.

Copy link
Contributor

@tcaputi tcaputi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if you can explain (and possibly comment in the code) as to how those sanity checks work.

if (ds2_queued) {
ASSERT3U(mintxg2, ==, dsl_dataset_phys(ds1)->ds_prev_snap_txg);
ASSERT3U(mintxg2, ==, dsl_dataset_phys(ds2)->ds_prev_snap_txg);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the sanity check here (and below?). If both are queued this looks impossible to me unless mintxg1 == mintxg2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ds1 and ds2 have the same ds_prev (see assertion in the beginning of dsl_dataset_clone_swap_sync_impl()). So if mintxgX is the prev snap txg (as asserted here), then mintxg1==mintxg2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that all mintxg-s must be equal and the asserts are implicitly asserting that.

@@ -2184,52 +2184,75 @@ dsl_scan_ds_clone_swapped(dsl_dataset_t *ds1, dsl_dataset_t *ds2, dmu_tx_t *tx)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while we're here can you change the comment above the function to say origin instead of parent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a simple s/parent/origin/ would not produce a good result.
Also, it seems that there is such a term as "clone parent-child dependency relationship", so "parent" is not completely out of place. And an unqualified "origin" seems to mean an origin snapshot.
Anyway, "parent" is more ambiguous than "origin" in this context, so I'll try to improve that comment.

Copy link
Member

@ahrens ahrens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like to see the comments I mentioned added, but I'm OK with the change as-is.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 12, 2019
@behlendorf
Copy link
Contributor

@avg-I can I twist your arm to add the requested additional comments. Then we can get this integrated.

@avg-I
Copy link
Contributor Author

avg-I commented Sep 14, 2019

@behlendorf , I can add the comments requested by Matt, but I genuinely cannot understand what they should say. (And in one case why it's needed at all).
So, if someone else can write them or can help me write them, then I can add them.

Copy link
Member

@ahrens ahrens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some specific suggestions for the comments.

}

if (ds1_queued && ds2_queued) {
/* Nothing to do if both are queued (or both are not queued). */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing this comment to:
If both are queued, we don't need to do anything. The swapping code below would not handle this case correctly, since we can't insert ds2 if it is already there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion! I will take it.
At the same time it's not the only problem with the hypothetical alternative mentioned in the comment. It would attempt to replace ds1 with ds2 but not in the opposite direction because of else if. And without the else if there would be another problem: replacing ds1 with ds2 and then ds2 back with ds1.
So, in my opinion, even if the queue silently ignored duplicate inserts, it would be very hard (if possible) to handle the case where both ds1 and ds2 are queued without explicitly handling that case in one way or another.
E.g., the proposed code shows one way; the existing code for the on-disk queue shows another: having the enqueue operation return an error code, checking for an error, and if the error is EEXIST, then backing out the original removal. Which is more work than just doing nothing in the first place.

ASSERT3U(mintxg, ==, dsl_dataset_phys(ds2)->ds_prev_snap_txg);
VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset,
/*
* Handle the on-disk scan queue.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding this to the comment:
The on-disk state is an out-of-date version of the in-memory state, so the in-memory and on-disk values for ds1_queuedandds2_queued may be different. Therefore we need to apply the swap logic to the on-disk state independently of the in-memory state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
@behlendorf behlendorf merged commit dd262c9 into openzfs:master Sep 18, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 18, 2019
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes openzfs#9140 
Closes openzfs#9163
ahrens pushed a commit to ahrens/zfs that referenced this pull request Sep 18, 2019
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes openzfs#9140 
Closes openzfs#9163
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 18, 2019
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes openzfs#9140 
Closes openzfs#9163
@avg-I avg-I deleted the 9140-dsl_scan_ds_clone_swapped branch September 19, 2019 11:27
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 19, 2019
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes openzfs#9140 
Closes openzfs#9163
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 23, 2019
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes openzfs#9140 
Closes openzfs#9163
tonyhutter pushed a commit that referenced this pull request Sep 26, 2019
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9140
Closes #9163
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested) Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

suspicious code in dsl_scan_ds_clone_swapped, related to sequential scan
4 participants