Fixed rare race condition error when creating the basetemp directory #5525
Codecov Report
@@ Coverage Diff @@
## master #5525 +/- ##
=======================================
Coverage 96.11% 96.11%
=======================================
Files 117 117
Lines 25695 25695
Branches 2493 2493
=======================================
Hits 24696 24696
Misses 695 695
Partials 304 304
@@ -33,7 +33,7 @@ def ensure_reset_dir(path):
    """
    if path.exists():
        rmtree(path, force=True)
    path.mkdir()
It absolutely has to fail when doing an ensure_reset.
The problem, as I understand it, is a race condition between workers trying to create the same basedir at the same time: workers 1 and 2 (for example) both manage to rmtree at the same time, but only one of them will succeed when they get to path.mkdir().
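The ordering described above can be replayed by hand without real processes. This is only a sketch: `simulate_interleaving` is a hypothetical helper, and `shutil.rmtree` stands in for pytest's own `rmtree(path, force=True)`.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_interleaving(path: Path) -> str:
    """Replay the suspected ordering step by step, one step per 'worker'."""
    # Both workers complete the removal step before either one recreates
    # the directory.
    for _worker in (1, 2):
        if path.exists():
            shutil.rmtree(path)

    path.mkdir()  # worker 1 gets there first and succeeds
    try:
        path.mkdir()  # worker 2 arrives second; the directory already exists
    except FileExistsError:
        return "worker 2 failed with FileExistsError"
    return "no error"


# Work inside a throwaway directory so nothing real is touched.
root = Path(tempfile.mkdtemp())
result = simulate_interleaving(root / "basetemp")
print(result)
shutil.rmtree(root)
```

Whether this interleaving can actually occur depends on more than one process calling the function on the same path, which is the point the rest of the thread addresses.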
ensure_reset_dir is about completely resetting a folder. If that invariant does not hold, something went badly wrong, and two processes that absolutely should not use the same folder will do so.
Lines 58 to 60 in 9136b3f
if self._given_basetemp is not None:
    basetemp = self._given_basetemp
    ensure_reset_dir(basetemp)
It must absolutely never pass if the observed error happens; a basetemp invocation itself is never concurrency-safe.
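A small sketch of why the failure is useful as a signal. It uses only `pathlib`, and the lenient variant is a hypothetical change shown for contrast, not a claim about what this PR's diff contained.

```python
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
shared = root / "basetemp"

shared.mkdir()                      # first process creates the directory
(shared / "a.txt").write_text("1")  # ...and starts writing into it

# Strict mkdir: a second process is told the directory is already in use.
try:
    shared.mkdir()
    strict_failed = False
except FileExistsError:
    strict_failed = True

# Lenient mkdir (hypothetical): no error is raised, so the second process
# silently continues in a directory that still holds the first one's files.
shared.mkdir(exist_ok=True)
leftover = sorted(p.name for p in shared.iterdir())

shutil.rmtree(root)
```

With the strict call, the collision surfaces immediately; with the lenient one, two processes end up sharing a folder without any indication.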
You are right, my bad. Indeed, basetemp is created by the master process, so there shouldn't be a race condition there.
I'm closing this and the related issue. I will investigate some more on our side, as we do pass a basetemp around. 👍
Thanks for the quick review!
Fix #5524
Probably hard/impossible to reproduce in a test, I'm afraid.