poor observability when hypothesis rejects data due to MAX_DEPTH #3643
Sounds great! Grepping through our uses of
Since the depth in question doesn't correspond to anything in our user-facing API - in particular, it's not exactly the same as the nesting depth of the produced examples - I'd prefer to keep this as private internals, and work out a better value or how to auto-configure it without needing user intervention.
Related issues: Zac-HD/hypothesmith#28, Zac-HD/hypothesmith#30 (comment). My limited free time permitting, I'm keen to help get better tools for source-code generation working - there are clearly ongoing uses for them 😀
Yes! Will do.
Opened the PR for more events on invalidation.
I'd like to continue this discussion with some more data from my comprehension generator. Would you prefer if we just continue in this issue (to keep all the above context), or that I open a new issue, so this one is just about the observability?
Let's go with a new issue, so that this one can be closed by your PR. (for which, thanks!)
Closed by #3654 ❤️🎉
I wrote a "comprehension fuzzer" using Hypothesis to fuzz the CPython compiler, specifically on list comprehensions: https://github.com/carljm/compgenerator

This is part of debugging/stabilizing the implementation of PEP 709. The fuzzer contains strategies (inspired by Hypothesmith, but not using it) to generate a valid and constrained Python AST containing comprehensions, and then tests its compilability and output in a control and a test build of CPython.
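For readers unfamiliar with this approach, here is a minimal stdlib-only sketch of the core idea (this is not compgenerator's actual code, just an illustration): build a constrained list-comprehension AST programmatically with the `ast` module, then compile and evaluate it.

```python
# Sketch: construct the AST for `[__a for __a in __b]` by hand,
# then compile and evaluate it - the same shape of check a
# compiler fuzzer performs on each generated sample.
import ast

comp = ast.Expression(
    body=ast.ListComp(
        elt=ast.Name(id="__a", ctx=ast.Load()),
        generators=[
            ast.comprehension(
                target=ast.Name(id="__a", ctx=ast.Store()),
                iter=ast.Name(id="__b", ctx=ast.Load()),
                ifs=[],
                is_async=0,
            )
        ],
    )
)
ast.fix_missing_locations(comp)  # hand-built nodes lack line/col info

code = compile(comp, "<fuzz-sample>", "eval")
result = eval(code, {"__b": [1, 2, 3]})
print(result)  # [1, 2, 3]
```

A real fuzzer would generate many such trees (with nesting, multiple generators, `if` clauses, etc.) and compare the compiled behavior across two CPython builds.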
I observed that the fuzzer is very slow because it rejects something like 98% of samples it generates. I've done some work in carljm/compgenerator#8 to try to reduce the visible reasons for rejection (my own uses of `.filter(...)`), but I still end up with stats looking like this:

(If it's useful to reproduce this, use carljm/compgenerator#8 and, after `pip install -r frozen.txt` in a Python 3.11 venv, run `pytest --hypothesis-show-statistics --hypothesis-verbosity=debug -s --run-evalserver`. Note that this does evaluate fuzzed AST samples, but the samples are quite constrained: no imports, no calls, and no use of any identifier other than `__a` and `__b`, which means no references to builtins, so it should be reasonably safe.)

I found this stats output very confusing, because I still have ~99% sample rejection, but basically none of it is coming from any reported events or filtering.
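To illustrate why a high rejection rate matters, here is a toy stdlib-only model (not Hypothesis itself): a generate-and-filter loop that discards ~98% of candidates spends nearly all of its time producing data that is thrown away, and without explicit counters that cost is invisible.

```python
# Toy model of rejection sampling with a ~2% acceptance rate:
# count attempts vs. accepted to make the hidden cost visible.
import random

random.seed(0)  # deterministic for the sake of the example

attempts = 0
accepted = []
while len(accepted) < 20:
    attempts += 1
    candidate = random.randrange(100)
    if candidate < 2:  # the "filter": only ~2% of candidates pass
        accepted.append(candidate)

rejection_rate = 1 - len(accepted) / attempts
print(f"attempts={attempts}, accepted={len(accepted)}, "
      f"rejection_rate={rejection_rate:.0%}")
```

This is exactly the kind of accounting `--hypothesis-show-statistics` surfaces for `.filter(...)` and `assume(...)` rejections - the problem reported below is a rejection path that bypasses that accounting.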
After some debugging (via Hypothesis "debug"-level verbosity and code spelunking in Hypothesis internals), I eventually realized that all of the rejections are coming from https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/src/hypothesis/internal/conjecture/data.py#L941. It seems the `MAX_DEPTH` value is hardcoded (https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/src/hypothesis/internal/conjecture/data.py#L729), and by default there is zero visibility into sample rejection due to reaching `MAX_DEPTH`.

This issue is not about helping me fix my strategies to avoid these rejections (though of course I'd welcome any suggestions in that direction over on carljm/compgenerator#7).
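The observability gap can be sketched with a stdlib-only model (this is not Hypothesis's internal code): a recursive generator with a hardcoded depth cap that, instead of failing silently, records a named event counter that a statistics report could surface.

```python
# Sketch: a depth-capped recursive generator that records WHY each
# sample was rejected, rather than discarding it silently.
from collections import Counter

MAX_DEPTH = 5  # stands in for Hypothesis's hardcoded 100
events = Counter()

def gen_tree(depth=0):
    """Generate a nested list; reject the sample past MAX_DEPTH."""
    if depth >= MAX_DEPTH:
        events["max depth exceeded"] += 1  # the proposed observability hook
        raise ValueError("invalid sample")
    # Deliberately always recurse, so every sample trips the cap -
    # mimicking a strategy that routinely produces deep trees.
    return [gen_tree(depth + 1)]

rejected = 0
for _ in range(10):
    try:
        gen_tree()
    except ValueError:
        rejected += 1

print(rejected, dict(events))  # 10 {'max depth exceeded': 10}
```

With the counter, a stats report can say "100% of rejections: max depth exceeded"; without it, the user sees only that most samples vanish.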
The issue I'm reporting here is that it would have been nice if Hypothesis' stats output had been more helpful in identifying the problem, without making me resort to debugging internals. This could maybe be as simple as recording an event when `MAX_DEPTH` is hit?

It also might be nice to have `MAX_DEPTH` configurable instead of hardcoded to 100, if, in my particular case, I'm OK with samples that have deeper trees? I could file a separate issue for that, if the suggestion makes any sense.