-
Notifications
You must be signed in to change notification settings - Fork 619
Fix strategy validation thread-safety #4473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix strategy validation thread-safety #4473
Conversation
9378c6f
to
42299bb
Compare
|
||
with validate_lock: | ||
try: | ||
self.validate_called = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should validate_called = True
be moved below do_validate()
, given the early return outside lock on l.486?
That would leave open the possibility of do_validate
called multiple time in an initialization race (unless explicitly checked inside), probably harmless though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... per comment, this isn't sufficient. Maybe the early return needs to move inside critical region, or if that is too expensive possibly a two-stage process, i.e. early-return like
if self.validate_called and not self.validate_in_progress:
return
I'm not thinking clearly right now, so please consider this a hint not a recipe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if self.validate_called and not self.validate_in_progress:
is also going to run into an infinite recursion, but possibly what we could do is set validate_calleds: dict[int, bool]
which tracks thread_id: validate_called
, and only return if validate has been called on this thread before. This would be lock-free and allow for concurrent validates. If threading.get_ident()
is not expensive then this could work. I'll test it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of this is that all threads rerun validation for all strategies. It's a tradeoff between singlethreaded and multithreaded performance. I'm defaulting to prioritizing singlethreaded performance, but I could see us changing this in the future (or now, if people have preferences)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to keep prioritizing single-threaded perf for now, at least when it's not super-lopsided.
Potential caveat in this case: is it safe to concurrently run validation for a strategy in multiple threads? If not, we should probably go with the smallish cost of a lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
concurrent validation should be safe, yeah. (in theory, and I've tested a full run in practice)
I went with the approach described in #4473 (comment). Benchmark:
![]() |
In the next version of `hypothesis` subclasses of `hypothesis.strategies.SearchStrategy` will be required to call `super().__init__()` in their `__init__` method (HypothesisWorks/hypothesis#4473). This PR addresses this in the two subclasses in our codebase: `CFTimeStrategy` and `CFTimeStrategyISO8601`. Apparently this kind of subclassing is not actually part of the public API ([link](https://github.com/HypothesisWorks/hypothesis/pull/4473/files#diff-9abc0311b216f25f0b71cfff6b7043b22071d09a58cb949f6bc5022ddeaa8e7f)), so maybe we should adjust the approach here long term, but this at least gets the tests passing for now. - [x] Closes #10541
Part of #4451. This one is pretty rare, only triggering about once per full test suite run under
--parallel-threads 2
. There might be a way to writedo_validate
that takes recursion into account and only tracksvalidate_called = True
at the end, but that's a bigger rewrite.I'm somewhat worried about the unconditional lock overhead in the singlethreaded case. It's ~pure lock/unlock overhead, I expect ~zero contention. Here's a claude benchmark for
.validate
forlambda i: st.integers().map(lambda x: x + i))
, where thelambda i
and.map
are cache-busters so each strategy really gets.validate
called.Details
Around 7% slower. Not great, since I've seen
.validate
be a performance hotspot before.The following command reproduces the relevant failure on master:
counter=1; while pytest hypothesis-python/tests/ --parallel-threads 2 -k test_invalid_args; do ((counter++)); done;