Introduce fuzz testing to separate unit tests from repetition tests #219
Conversation
Force-pushed 5dbb063 to 33ef804.
Self-review
Force-pushed 33ef804 to f142e4d.
Not now, but... it would be really nice if we could have coverage-guided fuzzing.
We'll still want random fuzzing, but this would help us make sure we at least cover as many error paths as possible.
Indeed it would 👍 Captured at #220
I do have one concern: the tests will remain really fast even if other parts slow down; basically, we'll just start losing test coverage. But that's probably not a huge deal (maybe run a longer nightly fuzzing job in CI)?
Makefile (outdated)
```makefile
test:
	GOGC=$(GOGC) go test ./...
.PHONY: test

fuzz: FUZZTIME ?= 10s # The duration to run fuzz testing, default to 10s if unset.
fuzz: # List all fuzz tests across the repo, and run them one at a time with the configured fuzztime.
	@go run ./scripts/list_fuzz_tests | while read -r line; do \
```
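For context, one plausible shape for the `scripts/list_fuzz_tests` helper that the loop above consumes is sketched below. This is an illustrative guess, not the script actually shipped in this PR: it walks the repository, parses `_test.go` files, and prints one `<package dir> <FuzzName>` pair per line for the shell loop to read.

```go
// Illustrative sketch only; the real scripts/list_fuzz_tests may differ.
// Walks the repo, parses *_test.go files, and prints "<package dir> <FuzzName>"
// lines for the Makefile loop to consume.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"io/fs"
	"path/filepath"
	"strings"
)

func main() {
	_ = filepath.WalkDir(".", func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() || !strings.HasSuffix(path, "_test.go") {
			return err
		}
		file, perr := parser.ParseFile(token.NewFileSet(), path, nil, parser.SkipObjectResolution)
		if perr != nil {
			return nil // Skip files that fail to parse rather than aborting the walk.
		}
		for _, decl := range file.Decls {
			fn, ok := decl.(*ast.FuncDecl)
			if ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Fuzz") {
				fmt.Printf("./%s %s\n", filepath.Dir(path), fn.Name.Name)
			}
		}
		return nil
	})
}
```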
For future readers: this is parallelized because Go will internally parallelize fuzzing.
Forgive me, but I don't think I'm aligned on the goals here. This seems to sacrifice test coverage (though I could well be misunderstanding something) to make the tests faster and less deterministic. Faster is good, but I want more coverage and full determinism in most cases.
I would love to discuss this, but I don't see how changing the random seed, without actually changing how the message queue works, adds any coverage within the scope of an individual run (which must be repeatable to be useful). Yes, I can see that varying seeds in some long-running CI job could help, but the proposed change seems much deeper than that, with big effects on the standard dev flow.
Continuing from #219 (comment) and #219 (comment), I'd consider:
That gives us immediate "did we break something" feedback while leaving the slow tests to CI. Although I'm starting to think that should happen in this PR, so we don't lose coverage and/or make the tests take longer.
Force-pushed 0ecae4d to 8a46e3b.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #219      +/-   ##
==========================================
+ Coverage   71.61%   71.66%   +0.04%
==========================================
  Files          31       31
  Lines        2336     2333       -3
==========================================
- Hits         1673     1672       -1
+ Misses        533      532       -1
+ Partials      130      129       -1
```
Let's merge #230, rebase, and compare then.
Note: to compare coverage, don't just look at the numbers.
That'll let you explore the actual coverage diff. (Maybe codecov.io will do this? But I've never had a good experience with it.)
Codecov will do it (apart from the fact that right now we are collecting coverage across all packages; we might want to change that).
I'm looking into parallelizing fuzzing "seeds" and it's not clear whether it actually works.
As I digest this change more thoroughly, I see there are two different things going on. One is a consolidation of test code, removing some duplication by moving to table-driven tests. This seems to be independent of the fuzzing. Next time it would be nice to review these separately.
I had not updated to the latest PR code when writing the below. I'm still nervous, but my initial observations were of an earlier version of the code. Invoking some of the fuzzing from
I remain a bit nervous about the effects of these changes on development and testing workflows. The tests that have been moved out to fuzzing are no longer captured by running "all tests". "All tests" is now much too fast, so I have to run some fuzzing to have any confidence. Fuzzing is difficult to run from the IDE*. And I'm even more nervous about how hard it will be to replicate a single test instance (participants, config, random seed) to read the logs, insert a debugger, etc.
[UPDATE] I see that when a fuzz test fails, Go outputs a command to re-run that test, which shouldn't be too hard to invoke from IDE context. I have at least now discovered how to run fuzz tests one at a time in GoLand, but I wish it were simpler. Would it be unreasonable to suggest adding some fuzz tests in addition to retaining the existing tests on main? Consider them the regression tests, which we can add to if a bug is discovered by fuzzing. Perhaps we could turn down the iteration count a bit in that case. (*) My attempt to right-click and run all fuzz tests in
The shell script in the Makefile is too sophisticated for GoLand to naively replicate. Can we change something so that it's not necessary?
@masih in case you're reading this in email: I retracted much of the above comment after realising I didn't have the latest code locally.
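For what it's worth, Go's fuzzing has the suggested regression mechanism built in: when `-fuzz` finds a failing input, it is written under `testdata/fuzz/<FuzzTestName>/` and replayed by plain `go test` from then on, and known-bad inputs can also be pinned explicitly with `f.Add`. A minimal sketch, with a hypothetical target name and a placeholder body:

```go
package example_test // hypothetical package, for illustration only

import "testing"

// FuzzExampleRegression shows how inputs discovered by fuzzing can be pinned
// as regressions: every f.Add entry runs on a plain `go test ./...`, with no
// -fuzz flag required.
func FuzzExampleRegression(f *testing.F) {
	f.Add(int64(0))        // original seed corpus entry
	f.Add(int64(-9138734)) // hypothetical input that once triggered a bug, pinned as a regression
	f.Fuzz(func(t *testing.T, seed int64) {
		// A real test would build and run the simulation with this seed and
		// assert its invariants; logging stands in for that here.
		t.Logf("would run the simulation with seed %d", seed)
	})
}
```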
Force-pushed c871b5a to 095e47a.
Thanks. I think we're on the same page now about replicating something similar to the existing coverage of inputs being tested (scenario × participant count × seed). I will stop objecting and leave it to other reviewers to 👍 the details.
I found the current test failure, so I can confirm that the workflow for isolating a failing test, tracing, IDE support, etc. is adequate here.
Force-pushed 095e47a to 9aba154.
Folks, this is ready for a final look; it:
Please let me know if there is anything that I might have missed. Happy fuzzing 🪲
Force-pushed 147783a to 4bf5f54.
Fuzzing failed with
but IMO let's get this merged, as it is a pure improvement given the baseline coverage without continuous fuzzing.
Force-pushed 0ab34b9 to 13060c5.
See golang/go#56238.
Force-pushed 13060c5 to 5d0cad7.
Prior to the changes introduced here, the majority of testing time was spent repeating the same test while changing the seed for generators or latency models.
Instead of repeating the same test over and over, this change uses the fuzz testing mechanism provided by the Go SDK. It allows the unit tests to run much faster while offering a configurable "fuzztime" that dictates how long the test cases should be fuzzed.
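For illustration, a sketch of the conversion pattern is below (this is not one of the actual tests in this PR; the package name and the property being checked are placeholders): the old loop over random seeds becomes a fuzz target whose seed is the fuzzed input, while a few `f.Add` entries keep a small deterministic baseline that still runs under a plain `go test ./...`.

```go
package sim_test // placeholder package name

import (
	"math/rand"
	"testing"
)

// FuzzSeededSimulation sketches the repetition-test-to-fuzz-test conversion:
// the seed that used to be iterated in a loop is now supplied by the fuzzer.
func FuzzSeededSimulation(f *testing.F) {
	for _, seed := range []int64{1, 42, 1413} {
		f.Add(seed) // seed corpus: always executed, even without -fuzz
	}
	f.Fuzz(func(t *testing.T, seed int64) {
		// Stand-in for running the simulation: two runs with the same seed must
		// behave identically, mirroring the determinism the real simulation
		// relies on to make failures reproducible.
		first := rand.New(rand.NewSource(seed)).Int63()
		second := rand.New(rand.NewSource(seed)).Int63()
		if first != second {
			t.Fatalf("seed %d produced non-deterministic output: %d != %d", seed, first, second)
		}
	})
}
```

Running `go test -fuzz=FuzzSeededSimulation -fuzztime=10s` in the test's package then explores new seeds for the configured duration; Go's `-fuzz` flag only fuzzes a single target in a single package per invocation, which is why the Makefile lists and runs the fuzz tests one at a time. Any failing input is saved to the package's `testdata/fuzz` directory and reported with an exact re-run command, keeping individual failures reproducible.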
The changes here introduce the following fuzz tests:
* `FuzzAbsentAdversary`
* `FuzzImmediateDecideAdversary`
* `FuzzHonest_AsyncRequireStrongQuorumToProgress`
* `FuzzHonest_SyncMajorityCommonPrefix`
* `FuzzHonest_AsyncMajorityCommonPrefix`
* `FuzzHonestMultiInstance_AsyncDisagreement`
* `FuzzHonestMultiInstance_SyncAgreement`
* `FuzzHonestMultiInstance_AsyncAgreement`
* `FuzzStoragePower_SyncIncreaseMidSimulation`
* `FuzzStoragePower_AsyncIncreaseMidSimulation`
* `FuzzStoragePower_SyncDecreaseRevertsToBase`
* `FuzzStoragePower_AsyncDecreaseRevertsToBase`
* `FuzzRepeatAdversary`
The CI workflow is updated to run each of the fuzz tests for 30 seconds in a dedicated job.
Resolves #218