Add `min_leaves: int = 0` argument to `st.recursive()` #4205
Comments
I'll defer to @Zac-HD, but I'm not opposed to a `min_leaves` argument.

> Since automatically varying global size is one of the strengths of PBT, we try not to expose too much control over this distribution or growth rate to the user, who might dramatically weaken their strategy by setting it (as in your other suggestions).
Are there any methods currently available for changing size? It is a very sharp tool, and nothing that should go at the top of a tutorial, but making it available somehow would be good for some cases. I agree that it can weaken strategies, but it can also strengthen them. One problem with `max_leaves` is that there seems to be something strange with how distributions are affected by changing it:
This peaks at generating 555 leaves at 100 examples. Now, I would expect the speed at which examples increase in size to be proportional to `max_leaves`.
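The kind of measurement behind these numbers can be sketched roughly as follows; the base strategy, `max_leaves` value, and example count here are illustrative assumptions, not the exact code from the report:

```python
# Rough sketch: measure how many leaves st.recursive() actually generates
# per example. Strategy shape and settings are illustrative.
from hypothesis import given, settings, strategies as st

trees = st.recursive(st.integers(), st.lists, max_leaves=500)

def count_leaves(tree):
    """Count integer leaves in a nested-list tree, iteratively to avoid
    hitting the recursion limit on deeply nested examples."""
    stack, leaves = [tree], 0
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            stack.extend(node)
        else:
            leaves += 1
    return leaves

sizes = []

@settings(max_examples=100, database=None)
@given(trees)
def record_size(tree):
    sizes.append(count_leaves(tree))

record_size()
print("max leaves seen:", max(sizes))
```

Running this repeatedly with different `max_leaves` values is one way to check whether the observed maximum actually scales with the bound.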
As a deliberate design choice, Hypothesis offers user control over the domain of generated values, but not over the distribution. Adding a `min_leaves` argument would be a form of control over the distribution. So... why not allow control of the distribution?
Finally, Hypothesis just doesn't necessarily try to hit the maximum size you specify, and we have an internal size limit on the number of choices we make during generation which usually nets out to a few thousand elements at most. This is also arbitrary, but in practice gives a good balance between performance and diversity which tends to maximize bug-finding power.
Thanks for the detailed reply! I'll have a look.

For my use-case, Hypothesis (or even Python) runtime is insignificant: the test itself runs CLI tools that are very slow relative to generation. It's essentially integration testing of CLI tools, with python/pytest/hypothesis as a test-running framework. There are also good reasons to believe that we will need to hit some (unknown) size threshold to trigger a bunch of important code paths (attempts at compression etc.). With that context, I don't think fuzzing is really an option at this point, which is why I really want to tell Hypothesis to make some large collections and then shrink until there is a minimal example.

While I understand the concerns about API complexity and backwards compatibility, there should be a way to give some control to the user. If the documentation is structured so that it is clear these are very sharp tools, I don't think it's a problem that they exist. That said, there is little point if we will just hit the internal size limit and that cannot be changed.
No docs, because it's explicitly an internal implementation detail, and likely to change over the next few months as we work on #3921. I think this is a case where monkeypatching is probably the way to go; you can see the average-size calculations here and patch the inherent size limits here; you may also want or need to mess with the relevant internals beyond that.
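In the abstract, the monkeypatching approach looks like the sketch below. The module and `SIZE_LIMIT` constant are stand-ins: the real constants live in Hypothesis's internals and their names are version-dependent, so check the linked source for the current location before patching:

```python
import types

# Stand-in for a library-internal module; the real size constants live in
# hypothesis's internals and their names change between versions.
fake_internals = types.ModuleType("fake_internals")
fake_internals.SIZE_LIMIT = 100  # hypothetical constant, for illustration

def monkeypatch_attr(module, name, value):
    """Replace module.<name> and return a callback that restores it."""
    original = getattr(module, name)
    setattr(module, name, value)
    return lambda: setattr(module, name, original)

restore = monkeypatch_attr(fake_internals, "SIZE_LIMIT", 10_000)
print(fake_internals.SIZE_LIMIT)  # patched value
restore()
print(fake_internals.SIZE_LIMIT)  # back to the original
```

In a pytest suite the built-in `monkeypatch` fixture does the same set-and-restore dance automatically, which is safer than patching by hand since patched internals leak into every other test otherwise.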
#3921 seems like a good project. I started experimenting with the monkeypatching approach; I could file what I found as a separate bug report if that makes sense. Also, going from 100 to 1000 examples does not increase the maximum generated size.
Hello,
It is possible this is somewhere in the documentation, but I could not find it. I have a recursive strategy, and I would like to generate very large recursive structures with a small number of examples. However, there doesn't seem to be a way to force Hypothesis to start generating large values.
Another framework I used has a `max_size` parameter, where sizes go from 0 to `max_size` over the number of examples, so if we want faster growth that can be set to some larger number as needed.

There are a couple of options here: either we could allow specifying some size parameter in the environment (like `@settings`, for example), or there could be strategies for scaling other strategies, like `scale` in this other framework. Lastly, there could be a `min_leaves` argument added to `recursive`.

Also, just to be clear, I much prefer Hypothesis over the other framework; it is only used as a reference because it allows manipulating the underlying size concept more directly, which is something that would simplify my use-case a good bit.
My use-case is essentially that there is a significant overhead to running a single example, while the impact of a large input is insignificant by comparison.
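For reference, the current `st.recursive()` API only bounds size from above. A minimal sketch (the particular base and extend strategies here are illustrative):

```python
from hypothesis import strategies as st

# A recursive strategy over nested lists of integers. max_leaves caps the
# size from above; there is currently no min_leaves-style lower bound,
# which is what this issue proposes.
trees = st.recursive(
    st.integers(),                        # base case: a leaf
    lambda children: st.lists(children),  # recursive case: list of subtrees
    max_leaves=100,
)

print(trees.example())  # an int, or a (possibly nested) list of ints
```

Small examples like a bare integer or an empty list remain perfectly valid draws, which is exactly the behaviour a `min_leaves` argument would rule out.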
EDIT: also this is a question/enhancement, could not figure out how to set labels