feature benchmark: repeated scenario runs #29637
Conversation
Review comment (outdated, resolved): misc/python/materialize/feature_benchmark/benchmark_versioning.py
I still need to validate and test the changes.
Force-pushed from f668f0e to 662cceb
This causes way more flakes. We need to discuss at the onsite whether we want to proceed with this change and possibly increase thresholds, or discard it.
In my opinion we should keep feature-benchmark as is so we can keep catching regressions, and improve parallel-benchmark to be the one-run-reliable benchmarking framework. First issue where parallel-benchmark was not entirely consistent: https://github.com/MaterializeInc/database-issues/issues/8571
I'll rebase this on top of #29664, maybe that helps? https://buildkite.com/materialize/nightly/builds/9644
Force-pushed from e092cff to 53b39ca
…esentative result
Force-pushed from 53b39ca to 49ddec6
Force-pushed from 49ddec6 to 25f1872
…election strategy
Force-pushed from 25f1872 to 087b7bf
I changed this PR to still always conduct three runs per scenario, but to pick the best outcome (instead of the median outcome) in 7478e73.
Good compromise.
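The selection strategy described above (three runs per scenario, best outcome instead of median) can be sketched as follows. This is a hypothetical illustration, not the actual code from `misc/python/materialize/feature_benchmark`; the function names and the assumption that "best" means the minimum wallclock time are mine.

```python
import statistics


def best_measurement(wallclock_times: list[float]) -> float:
    """New strategy sketched in the PR: run the scenario three times and
    report the best (fastest) run, on the assumption that the minimum
    reflects achievable performance while slower runs are noise."""
    assert wallclock_times, "need at least one run"
    return min(wallclock_times)


def median_measurement(wallclock_times: list[float]) -> float:
    """Previous strategy for comparison: the median of the repeated runs."""
    assert wallclock_times, "need at least one run"
    return statistics.median(wallclock_times)
```

Picking the best run makes the benchmark less flaky in the face of one-off slow runs, at the cost of potentially hiding regressions that only show up as increased variance rather than a slower minimum.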
Merged 959f401 into MaterializeInc:main
This implements MaterializeInc/database-issues#8565.
Nightly: https://buildkite.com/materialize/nightly/builds?branch=nrainer-materialize%3Afeature-benchmark%2Frepeated-scenario-runs