Introduce fuzz testing to separate unit tests from repetition tests #219
Conversation
Force-pushed 5dbb063 to 33ef804.
Self-review
Force-pushed 33ef804 to f142e4d.
Not now, but... it would be really nice if we could have coverage-guided fuzzing.
We'll still want random fuzzing, but this would help us make sure we at least cover as many error paths as possible.
Indeed it would 👍 Captured at #220
I do have one concern: the tests will remain really fast even if other parts slow down; basically, we'll just start losing test coverage. But that's probably not a huge deal (maybe run a longer nightly fuzzing job in CI)?
Makefile (outdated)
```makefile
test:
	GOGC=$(GOGC) go test ./...
.PHONY: test

fuzz: FUZZTIME ?= 10s # The duration to run fuzz testing, default to 10s if unset.
fuzz: # List all fuzz tests across the repo, and run them one at a time with the configured fuzztime.
	@go run ./scripts/list_fuzz_tests | while read -r line; do \
```
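For context, one plausible shape for the `scripts/list_fuzz_tests` helper that the loop above consumes is sketched below. This is an illustrative guess, not the script actually shipped in this PR: it walks the repository, parses `_test.go` files, and prints one `<package dir> <FuzzName>` pair per line for the shell loop to read.

```go
// Illustrative sketch only; the real scripts/list_fuzz_tests may differ.
// Walks the repo, parses *_test.go files, and prints "<package dir> <FuzzName>"
// lines for the Makefile loop to consume.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"io/fs"
	"path/filepath"
	"strings"
)

func main() {
	_ = filepath.WalkDir(".", func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() || !strings.HasSuffix(path, "_test.go") {
			return err
		}
		file, perr := parser.ParseFile(token.NewFileSet(), path, nil, parser.SkipObjectResolution)
		if perr != nil {
			return nil // Skip files that fail to parse rather than aborting the walk.
		}
		for _, decl := range file.Decls {
			fn, ok := decl.(*ast.FuncDecl)
			if ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Fuzz") {
				fmt.Printf("./%s %s\n", filepath.Dir(path), fn.Name.Name)
			}
		}
		return nil
	})
}
```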
For future readers: this is parallelized because Go will internally parallelize fuzzing.
Forgive me, but I don't think I'm aligned on the goals here. This seems to sacrifice test coverage (though I could well be misunderstanding something) to make the tests faster and less deterministic. Faster is good, but I want more coverage and full determinism in most cases.
I would love to discuss this, but I don't see how changing the random seed, without actually changing how the message queue works, adds any coverage within the scope of an individual run (which must be repeatable to be useful). Yes, I can see that varying seeds in some long-running CI job could help, but the proposed change seems much deeper than that, with big effects on the standard dev flow.
Continuing from #219 (comment) and #219 (comment), I'd consider:
That gives us immediate "did we break something" feedback while leaving the slow tests to CI. Although I'm starting to think that should happen in this PR, so we don't lose coverage and/or make the tests take longer.
Force-pushed 0ecae4d to 8a46e3b.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #219      +/-   ##
==========================================
+ Coverage   71.61%   71.66%   +0.04%
==========================================
  Files          31       31
  Lines        2336     2333       -3
==========================================
- Hits         1673     1672       -1
+ Misses        533      532       -1
+ Partials      130      129       -1
```
Let's merge #230, rebase, and compare then.
Note: to compare coverage, don't just look at the numbers.
That'll let you explore the actual coverage diff. (Maybe codecov.io will do this? But I've never had a good experience with it.)
Codecov will do it (apart from the fact that right now we are collecting coverage across all packages; we might want to change that).
I'm looking into parallelizing fuzzing "seeds" and it's not clear whether it actually works.
As I digest this change more thoroughly, I see there are two different things going on. One is a consolidation of test code, removing some duplication by moving to table-driven tests. This seems to be independent of the fuzzing. Next time it would be nice to review these separately.
I had not updated to the latest PR code when writing the below. I'm still nervous, but my initial observations were of an earlier version of the code. Invoking some of the fuzzing from
I remain a bit nervous about the effects of these changes on development and testing workflows. The tests that have been moved out to fuzzing are no longer captured by running "all tests". "All tests" is now much too fast, so I have to run some fuzzing to have any confidence. Fuzzing is difficult to run from the IDE*. And I'm even more nervous about how hard it will be to replicate a single test instance (participants, config, random seed) to read the logs, insert a debugger, etc.
[UPDATE] I see that when a fuzz test fails, Go outputs a command to re-run that test, which shouldn't be too hard to invoke from IDE context. I have at least now discovered how to run fuzz tests one at a time in GoLand, but I wish it were simpler. Would it be unreasonable to suggest adding some fuzz tests in addition to retaining the existing tests on main? Consider them the regression tests, which we can add to if a bug is discovered by fuzzing. Perhaps we could turn down the iteration count a bit in that case. (*) My attempt to right-click and run all fuzz tests in
The shell script in the Makefile is too sophisticated for GoLand to naively replicate. Can we change something so that it's not necessary?
@masih in case you're reading this in email: I retracted much of the above comment after realising I didn't have the latest code locally.
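For what it's worth, Go's fuzzing has the suggested regression mechanism built in: when `-fuzz` finds a failing input, it is written under `testdata/fuzz/<FuzzTestName>/` and replayed by plain `go test` from then on, and known-bad inputs can also be pinned explicitly with `f.Add`. A minimal sketch, with a hypothetical target name and a placeholder body:

```go
package example_test // hypothetical package, for illustration only

import "testing"

// FuzzExampleRegression shows how inputs discovered by fuzzing can be pinned
// as regressions: every f.Add entry runs on a plain `go test ./...`, with no
// -fuzz flag required.
func FuzzExampleRegression(f *testing.F) {
	f.Add(int64(0))        // original seed corpus entry
	f.Add(int64(-9138734)) // hypothetical input that once triggered a bug, pinned as a regression
	f.Fuzz(func(t *testing.T, seed int64) {
		// A real test would build and run the simulation with this seed and
		// assert its invariants; logging stands in for that here.
		t.Logf("would run the simulation with seed %d", seed)
	})
}
```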
Force-pushed c871b5a to 095e47a.
Thanks. I think we're on the same page now about replicating something similar to the existing coverage of inputs being tested (scenario × participant count × seed). I will stop objecting and leave it to other reviewers to 👍 the details.
I found the current test failure, so I can confirm that the workflow for isolating a failing test, tracing, IDE support, etc. is adequate here.
Force-pushed 095e47a to 9aba154.
Folks, this is ready for a final look; it:
Please let me know if there is anything that I might have missed. Happy fuzzing 🪲
Force-pushed 147783a to 4bf5f54.
Fuzzing failed with
but IMO let's get this merged, as it is a pure improvement given the baseline coverage without continuous fuzzing.
Force-pushed 0ab34b9 to 13060c5.
See golang/go#56238.
Force-pushed 13060c5 to 5d0cad7.
Prior to the changes introduced here, the majority of testing time was spent repeating the same test while changing the seed for generators or latency models.
Instead of repeating the same test over and over, this change uses the fuzz testing mechanism provided by the Go SDK. It allows the unit tests to run much faster while offering a configurable "fuzztime" that dictates how long the test cases should be fuzzed.
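For illustration, a sketch of the conversion pattern is below (this is not one of the actual tests in this PR; the package name and the property being checked are placeholders): the old loop over random seeds becomes a fuzz target whose seed is the fuzzed input, while a few `f.Add` entries keep a small deterministic baseline that still runs under a plain `go test ./...`.

```go
package sim_test // placeholder package name

import (
	"math/rand"
	"testing"
)

// FuzzSeededSimulation sketches the repetition-test-to-fuzz-test conversion:
// the seed that used to be iterated in a loop is now supplied by the fuzzer.
func FuzzSeededSimulation(f *testing.F) {
	for _, seed := range []int64{1, 42, 1413} {
		f.Add(seed) // seed corpus: always executed, even without -fuzz
	}
	f.Fuzz(func(t *testing.T, seed int64) {
		// Stand-in for running the simulation: two runs with the same seed must
		// behave identically, mirroring the determinism the real simulation
		// relies on to make failures reproducible.
		first := rand.New(rand.NewSource(seed)).Int63()
		second := rand.New(rand.NewSource(seed)).Int63()
		if first != second {
			t.Fatalf("seed %d produced non-deterministic output: %d != %d", seed, first, second)
		}
	})
}
```

Running `go test -fuzz=FuzzSeededSimulation -fuzztime=10s` in the test's package then explores new seeds for the configured duration; Go's `-fuzz` flag only fuzzes a single target in a single package per invocation, which is why the Makefile lists and runs the fuzz tests one at a time. Any failing input is saved to the package's `testdata/fuzz` directory and reported with an exact re-run command, keeping individual failures reproducible.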
The changes here introduce the following fuzz tests:
* `FuzzAbsentAdversary`
* `FuzzImmediateDecideAdversary`
* `FuzzHonest_AsyncRequireStrongQuorumToProgress`
* `FuzzHonest_SyncMajorityCommonPrefix`
* `FuzzHonest_AsyncMajorityCommonPrefix`
* `FuzzHonestMultiInstance_AsyncDisagreement`
* `FuzzHonestMultiInstance_SyncAgreement`
* `FuzzHonestMultiInstance_AsyncAgreement`
* `FuzzStoragePower_SyncIncreaseMidSimulation`
* `FuzzStoragePower_AsyncIncreaseMidSimulation`
* `FuzzStoragePower_SyncDecreaseRevertsToBase`
* `FuzzStoragePower_AsyncDecreaseRevertsToBase`
* `FuzzRepeatAdversary`
The CI workflow is updated to run each of the fuzz tests for 30 seconds in a dedicated job.
Resolves #218