Comparing generated instances by print cb output #38
Comments
I'd strongly recommend not using this to indicate already tried while shrinking. Moving to a smaller representation of the same thing is great for shrinking! It decreases the example size, which improves the performance (and sometimes quality) of future shrinks.
Perhaps not for shrinking, but it seems like there's less downside for reporting -- there's probably little value in reporting another failure that is represented with a byte-identical string, beyond noting it as another seed that leads to a failure. It's a different issue if it was initially a different failure that became shadowed by a common, simpler failure while shrinking, but failure tagging (#14) attempts to address that directly.
Yeah, I agree, that seems perfectly sensible! Though I wonder if it has enough of a hit rate to be worth it. Do people often include pointer values in their output? If so, the output probably isn't even stable for a fixed bit stream. (Hypothesis doesn't do this because the only time we show multiple final examples is when they lead to a different error.)
I don't know if people include pointers, but I've been watching theft report an identical failure about once a minute today. (Generating source to fuzz a compiler.) The crux of the problem is: how do I detect that it IS a different error? (I'd rather not depend on instrumentation from a specific compiler, or other non-portable approaches.)
Fair enough!
Yeah, this is easier in languages that aren't C! In Hypothesis it's just line number + exception type.
This is another case where the optional coverage reporting hook (#43) could be useful.
With how autoshrinking works, there currently isn't a good way to detect generated instances that are structurally equivalent, but produced by a different random bitstream. This can lead to reporting the same failure several times, because it's generating the same input value by different code paths.
For example, the built-in generators have tables of interesting values, but the shrinker doesn't know whether a random uint16_t chose 65535 through random generation, or because it came from the table.
It might work well to treat instances whose print callback output is the same as equivalent. The multi-core branch will already be adding buffering for each worker's stdout -- we could print to a buffer, check its hash in the bloom filter, and use that to determine whether it has already been tried. This probably shouldn't be enabled by default (at least not until it's already been optional for a release), because it could cause surprising interactions with existing tests.
If this causes strange behavior during shrinking, it may not be worthwhile, but it would be a good experiment. (For example, it might get hung up on input generated one way, but refuse to shrink further because the intermediate output has already been marked as tried. Other bugs caused that to happen while working on autoshrinking.)
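A rough sketch of the "print to a buffer, hash it, check a bloom filter" idea, to make the proposal concrete. This is not theft's actual code or API: `struct instance`, `print_instance`, `struct bloom`, and `already_reported` are hypothetical names invented for illustration, the hash is plain 64-bit FNV-1a, and the filter size and probe count are arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical instance type, standing in for whatever a test generates. */
struct instance { uint16_t value; };

/* Hypothetical print callback that renders into a buffer instead of stdout.
 * Returns the number of bytes written (excluding the terminating NUL). */
static size_t print_instance(const struct instance *inst, char *buf, size_t size) {
    int n = snprintf(buf, size, "%u", (unsigned)inst->value);
    if (n < 0) { return 0; }
    return (size_t)n < size ? (size_t)n : size - 1;  /* clamp if truncated */
}

/* 64-bit FNV-1a over the printed bytes. */
static uint64_t fnv1a(const char *data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Tiny fixed-size bloom filter; size and probe count are arbitrary here. */
#define BLOOM_BITS 4096u
#define BLOOM_PROBES 3u
struct bloom { uint8_t bits[BLOOM_BITS / 8]; };

/* Marks the hash as seen. Returns true if every probed bit was already set,
 * i.e. the hash was probably seen before (false positives are possible). */
static bool bloom_test_and_set(struct bloom *b, uint64_t h) {
    uint32_t h1 = (uint32_t)h;
    uint32_t h2 = (uint32_t)(h >> 32) | 1u;  /* odd stride for double hashing */
    bool seen = true;
    for (uint32_t i = 0; i < BLOOM_PROBES; i++) {
        uint32_t bit = (h1 + i * h2) % BLOOM_BITS;
        uint8_t mask = (uint8_t)(1u << (bit % 8));
        if (!(b->bits[bit / 8] & mask)) { seen = false; }
        b->bits[bit / 8] |= mask;
    }
    return seen;
}

/* True if an instance with byte-identical printed output was probably
 * recorded earlier, regardless of which bitstream produced it. */
static bool already_reported(struct bloom *b, const struct instance *inst) {
    char buf[64];
    size_t len = print_instance(inst, buf, sizeof buf);
    return bloom_test_and_set(b, fnv1a(buf, len));
}
```

With a zeroed `struct bloom`, the first call to `already_reported` for a 65535 instance returns false and a second call for another 65535 instance returns true, whether the value came from the table or from random generation. Because a bloom filter can give false positives, this seems safer for suppressing duplicate reports than for skipping shrink candidates, in line with the concern raised in the comments.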