Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature benchmark: repeated scenario runs #29637

Conversation

nrainer-materialize
Copy link
Contributor

@nrainer-materialize nrainer-materialize commented Sep 18, 2024

@nrainer-materialize nrainer-materialize added the T-testing Theme: tests or test infrastructure label Sep 18, 2024
@nrainer-materialize nrainer-materialize self-assigned this Sep 18, 2024
@nrainer-materialize
Copy link
Contributor Author

I still need to validate and test the changes.

@nrainer-materialize nrainer-materialize force-pushed the feature-benchmark/repeated-scenario-runs branch from f668f0e to 662cceb Compare September 19, 2024 09:22
@nrainer-materialize
Copy link
Contributor Author

This causes way more flakes. We need to discuss at the onsite whether we want to proceed with this change and possibly increase thresholds or discard it.

@def-
Copy link
Contributor

def- commented Sep 19, 2024

In my opinion we should keep feature-benchmark as is so we can keep catching regressions, and improve parallel-benchmark to be the one-run-reliable benchmarking framework. First issue about where parallel-benchmark was not entirely consistent: https://github.com/MaterializeInc/database-issues/issues/8571

@def-
Copy link
Contributor

def- commented Sep 19, 2024

I'll rebase this on top of #29664, maybe that helps? https://buildkite.com/materialize/nightly/builds/9644
Edit: Didn't help

@def- def- force-pushed the feature-benchmark/repeated-scenario-runs branch 2 times, most recently from e092cff to 53b39ca Compare September 19, 2024 15:12
@nrainer-materialize nrainer-materialize force-pushed the feature-benchmark/repeated-scenario-runs branch from 53b39ca to 49ddec6 Compare September 20, 2024 07:33
@nrainer-materialize nrainer-materialize force-pushed the feature-benchmark/repeated-scenario-runs branch from 49ddec6 to 25f1872 Compare September 20, 2024 08:24
@nrainer-materialize nrainer-materialize force-pushed the feature-benchmark/repeated-scenario-runs branch from 25f1872 to 087b7bf Compare September 20, 2024 08:39
@nrainer-materialize
Copy link
Contributor Author

In my opinion we should keep feature-benchmark as is so we can keep catching regressions

I changed this PR to still always conduct three runs per scenario but pick the best outcome (instead of the median outcome) with 7478e73.

@def-
Copy link
Contributor

def- commented Sep 20, 2024

Good compromise

@nrainer-materialize nrainer-materialize merged commit 959f401 into MaterializeInc:main Sep 20, 2024
19 of 22 checks passed
@nrainer-materialize nrainer-materialize deleted the feature-benchmark/repeated-scenario-runs branch September 20, 2024 11:44
@github-actions github-actions bot locked and limited conversation to collaborators Sep 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-testing Theme: tests or test infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants