constrain generation better to filter less? #7
I looked at this a little bit (by running it and printing out all the rejected code). We actually don't generate that much code that doesn't compile; there's occasionally some with an invalid […]. I tried locally to improve things by constraining the presence of […]
I think I've identified the problem here, but haven't identified the solution yet. I made some changes to try to generate fewer invalid examples (see #8), but I still get Hypothesis stats looking like this (on a run of just […])
You can see from the events (and I confirmed by printing uncompilable examples) that we hardly filter anything due to incompilability anymore (and we filter out nothing due to no-listcomps), and yet we still have ~98% invalid examples. Why? I turned on debug logging in Hypothesis (…)
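One way to sanity-check what the fuzzer's own filters are (and aren't) rejecting, independent of Hypothesis' stats, is to tally rejection reasons over a batch of generated samples. This is a hypothetical diagnostic, not code from this repo; the `classify` helper and the crude textual listcomp check are illustrative assumptions that mirror the two filters described above:

```python
from collections import Counter

def classify(src: str) -> str:
    # Hypothetical classifier mirroring the fuzzer's two filters:
    # does the sample compile, and does it contain a list comprehension?
    try:
        compile(src, "<sample>", "exec")
    except SyntaxError:
        return "does-not-compile"
    if "for" not in src or "[" not in src:
        return "no-listcomp"  # crude textual check, for illustration only
    return "kept"

samples = ["[x for x in y]", "def f(:", "a = 1"]
stats = Counter(classify(s) for s in samples)
```

If the counts here show almost everything being kept while Hypothesis still reports ~98% invalid, the rejection must be happening inside Hypothesis itself rather than in our filters.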
I dug into Hypothesis' source code, and some extra debug prints confirmed that each of those lines is a time we hit this case: https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/src/hypothesis/internal/conjecture/data.py#L941

So it seems that internally, Hypothesis has a hard-coded max depth of 100 (https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/src/hypothesis/internal/conjecture/data.py#L729), and when you hit that max depth, it starts silently discarding samples, in a way that's effectively invisible in the stats. I expect the reason we hit it is that our strategies are too recursive, but I'm not sure how to fix that. Maybe using […]
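One standard way to avoid blowing past an internal depth limit is to bound recursion at generation time rather than filtering afterwards; Hypothesis exposes this via the `max_leaves` argument to `st.recursive`. The sketch below illustrates the same idea in plain Python (it is not code from this repo, and the expression grammar is a made-up stand-in): thread an explicit depth budget through the generator so deep nesting is impossible by construction.

```python
import random

def gen_expr(rng: random.Random, depth_budget: int) -> str:
    # Depth-bounded generation: when the budget runs out we are forced
    # into the base case, so recursion can never exceed depth_budget.
    if depth_budget == 0 or rng.random() < 0.4:
        return str(rng.randint(0, 9))           # base case: a literal
    left = gen_expr(rng, depth_budget - 1)      # recursive case: a sum
    right = gen_expr(rng, depth_budget - 1)
    return f"({left} + {right})"

sample = gen_expr(random.Random(0), depth_budget=5)
```

With the budget enforced up front, every generated expression is valid Python, so nothing needs to be discarded for being too deep.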
We still filter out a lot of generated samples as non-compilable or not containing a listcomp. If we could constrain generation better and filter less (without excluding interesting examples), it would make the fuzzer run a lot faster.
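For reference, the two filters described above (compilable, and contains a list comprehension) can be expressed as a single predicate over the generated source. This is a hypothetical sketch of such a filter using the standard `ast` module, not the repo's actual implementation; the name `keeps_sample` is made up:

```python
import ast

def keeps_sample(src: str) -> bool:
    # A sample survives filtering only if it parses as Python AND
    # contains at least one list comprehension somewhere in its AST.
    try:
        tree = ast.parse(src)
    except SyntaxError:
        return False
    return any(isinstance(node, ast.ListComp) for node in ast.walk(tree))
```

The goal described in this issue is to make the generator satisfy this predicate by construction, so the predicate rejects (almost) nothing.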