Block cloning conditionally destroy ARC buffer #16337
It still looks OK to me. I am not sure I like running a heavy I/O test for 3 minutes each time to test for one scenario though. Tests already take too much time. But I'll leave it to somebody who is closer to the tests area to comment. I wonder, since we know where the problem was, could we try to craft something more specific to trigger it?
No disagreement from me that running a test for 3 mins is not great... I have direct I/O tests that could also trigger this, but oftentimes it would take running them for 100+ iterations to get it to hit. It is hard to get things to trigger with timing issues like this. This is also why I asked @ixhamza for a test case he might supply us that could easily reproduce the issue. We might be able to craft something for sure. I just wonder if we will get into the same dilemma: with it being such a timing-dependent thing to trigger, we would still wind up having to run multiple iterations of it. Open to ideas though for a better reproducer if we could craft one.
Looks like the new test case timed out on FreeBSD 13.
Yeah, looking at the output, the test never made it back from doing the first sync of the pool. I played around with this a bit in a FreeBSD VM with few resources to mimic the CI runners. The vast majority of the time was spent just doing the initial |
Looking at the test results, it seems like for every ~20 clonefiles/sync:
...there's 1 dd:
Is that OK?
dmu_buf_will_clone() calls arc_buf_destroy() if there is an associated ARC buffer with the dbuf. However, this can only be done conditionally. If the previous dirty record's dr_data is pointed at db_buf then destroying it can lead to a NULL pointer dereference when syncing out the previous dirty record.
This updates dmu_buf_will_clone() to only call arc_buf_destroy() if the previous dirty record's dr_data is not pointing to db_buf. The block clone will still set the dbuf's db_buf and db_data to NULL, but this will not cause any issues as any previous dirty record's dr_data will still be pointing at the ARC buffer.
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
I did some local testing as well on a node I have, and I was seeing about the same thing. I just decided to remove the test case. The patch has been locally tested with the original test case by @ixhamza to show that without this patch the error occurs. We might be able to come up with a test case in the future that would be better at stressing this, but I don't think we should hold off merging a fix for a NULL pointer dereference based on a test case.
dmu_buf_will_clone() calls arc_buf_destroy() if there is an associated ARC buffer with the dbuf. However, this can only be done conditionally. If the previous dirty record's dr_data is pointed at db_buf then destroying it can lead to a NULL pointer dereference when syncing out the previous dirty record.
This updates dmu_buf_will_clone() to only call arc_buf_destroy() if the previous dirty record's dr_data is not pointing to db_buf. The block clone will still set the dbuf's db_buf and db_data to NULL, but this will not cause any issues as any previous dirty record's dr_data will still be pointing at the ARC buffer.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes openzfs#16337
Updated dmu_buf_will_clone() to conditionally call arc_buf_destroy().
Motivation and Context
dmu_buf_will_clone() always called arc_buf_destroy() if there was an associated ARC buffer with the dbuf. However, this can lead to a NULL pointer dereference, which can occur when a previous dirty record is being synced and its dr_data is pointing at the ARC buffer also pointed to by db_buf.
Description
This updates dmu_buf_will_clone() to only call arc_buf_destroy() if the previous dirty record's dr_data is not pointing to db_buf. The block clone will still set the dbuf's db_buf and db_data to NULL, but this will not cause any issues as any previous dirty record's dr_data will still be pointing at the ARC buffer.
How Has This Been Tested?
Ran ZTS tests with the bclone and block_cloning tags for 5 iterations without any issues. Testing was done with kernel 4.18.0-408.el8.x86_64.
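The conditional destroy described in this PR can be illustrated with a small standalone model. This is only a sketch and not the OpenZFS code: every toy_* type, field and function is a hypothetical stand-in, destruction is modeled by setting a flag rather than freeing memory, and destroying the buffer when there is no previous dirty record at all is an assumption about the intended behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ARC buffer. */
typedef struct toy_arc_buf {
	int payload;
	bool destroyed;		/* set instead of actually freeing memory */
} toy_arc_buf_t;

/* Hypothetical stand-in for a dirty record that still has to be synced. */
typedef struct toy_dirty_record {
	toy_arc_buf_t *dr_data;
} toy_dirty_record_t;

/* Hypothetical stand-in for a dbuf. */
typedef struct toy_dbuf {
	toy_arc_buf_t *db_buf;
	void *db_data;
	toy_dirty_record_t *db_prev_dr;	/* previous dirty record, or NULL */
} toy_dbuf_t;

/* Models arc_buf_destroy() by flagging the buffer as gone. */
void
toy_arc_buf_destroy(toy_arc_buf_t *buf)
{
	buf->destroyed = true;
}

/*
 * Prepare a dbuf for a block clone.
 * Returns true if the attached buffer was destroyed.
 */
bool
toy_buf_will_clone(toy_dbuf_t *db)
{
	bool destroyed = false;

	assert(db != NULL);
	if (db->db_buf != NULL) {
		toy_dirty_record_t *dr = db->db_prev_dr;

		/*
		 * Only destroy the buffer if the previous dirty record does
		 * not reference it. Assumption: with no previous dirty
		 * record, nothing else needs the buffer.
		 */
		if (dr == NULL || dr->dr_data != db->db_buf) {
			toy_arc_buf_destroy(db->db_buf);
			destroyed = true;
		}
		/* The dbuf always drops its own references. */
		db->db_buf = NULL;
		db->db_data = NULL;
	}
	return (destroyed);
}

/*
 * Models syncing out a dirty record. Returns the payload, or -1 if the
 * record's data was destroyed underneath it.
 */
int
toy_sync_dirty_record(const toy_dirty_record_t *dr)
{
	if (dr->dr_data == NULL || dr->dr_data->destroyed)
		return (-1);
	return (dr->dr_data->payload);
}
```

In this model, when the previous dirty record's dr_data and the dbuf's db_buf point at the same buffer, toy_buf_will_clone() clears db_buf and db_data but leaves the buffer alone, so a later toy_sync_dirty_record() still succeeds. Destroying the buffer unconditionally would make that sync fail, which is the toy equivalent of the NULL pointer dereference.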