
Visualize subsystem regression tests results #3531

Closed
AndreiEres opened this issue Feb 29, 2024 · 0 comments
Assignees
Labels
T12-benchmarks This PR/Issue is related to benchmarking and weights.

Comments


AndreiEres commented Feb 29, 2024

Subsystem regression tests for availability read and write were introduced in #3311.

We should have graphs based on the benchmark results so we can track performance changes over time. The visualized results should be made available to the community.

AndreiEres added the T12-benchmarks label on Feb 29, 2024
AndreiEres self-assigned this on Feb 29, 2024
github-merge-queue bot pushed a commit that referenced this issue Mar 1, 2024
…#3311)

### What's been done
- `subsystem-bench` has been split into two parts: a CLI benchmark runner and a library.
- The CLI runner is quite simple: it just runs `.yaml`-based test sequences, and it should now only be used to run benchmarks during development.
- The library is used by the CLI runner and by the regression tests; some code was changed to make the library independent of the runner.
- Added the first regression tests, for availability read and write, which replicate the existing test sequences (a rough sketch follows below).
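
For illustration, here is a minimal sketch of how such a harness-less regression test could be built on top of the library. The module path, `TestConfiguration`, `benchmark_availability_read`, and `cpu_within_baseline` are assumptions for this sketch, not the actual crate API:

```rust
// Hypothetical integration test (built with `harness = false`), gated behind
// the `subsystem-benchmarks` feature. All names below are illustrative.
#[cfg(feature = "subsystem-benchmarks")]
fn main() {
    use polkadot_subsystem_bench::availability::{benchmark_availability_read, TestConfiguration};

    // Replicate the parameters of the existing `.yaml` test sequence.
    let config = TestConfiguration::default();

    // Run the benchmark through the library (no CLI runner involved) and
    // fail if the measured resource usage exceeds the expected baseline.
    let usage = benchmark_availability_read(config);
    assert!(usage.cpu_within_baseline(), "CPU usage regression detected");
}

// Compiled out when the feature is disabled, so a plain `cargo test` run
// builds an empty binary and nothing is executed.
#[cfg(not(feature = "subsystem-benchmarks"))]
fn main() {}
```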

### How we run regression tests
- Regression tests are simply Rust integration tests without a test harness.
- They are only compiled under the `subsystem-benchmarks` feature, to prevent them from running alongside other tests.
- This doesn't work when running tests with `nextest` in CI, so additional filters have been added to the `nextest` runs.
- Each benchmark run takes a different amount of time at the beginning, so we "warm up" the tests until their CPU usage differs by only 1%.
- After the warm-up, we run the benchmarks a few more times and compare the average against the expected value within a given precision (see the sketch after this list).
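
As a rough illustration of the warm-up and comparison logic described above; the helper name `run_with_warm_up`, the run count, and the thresholds are assumptions for this sketch, not the actual implementation:

```rust
/// Warm up a benchmark until consecutive runs agree to within 1%, then
/// average a few more runs and compare against the expected value.
/// (Illustrative sketch; names and constants are assumptions.)
fn run_with_warm_up<F: Fn() -> f64>(run_benchmark_once: F, expected_cpu: f64, precision: f64) {
    // Warm-up phase: repeat until two consecutive runs differ by at most 1%.
    let mut previous = run_benchmark_once();
    loop {
        let current = run_benchmark_once();
        if (current - previous).abs() / previous <= 0.01 {
            break;
        }
        previous = current;
    }

    // Measurement phase: average a few more runs and allow a deviation of
    // at most `precision` from the expected value.
    const RUNS: usize = 5;
    let average: f64 = (0..RUNS).map(|_| run_benchmark_once()).sum::<f64>() / RUNS as f64;
    assert!(
        (average - expected_cpu).abs() <= precision,
        "benchmark regressed: average {average} vs expected {expected_cpu}"
    );
}
```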

### What is still wrong?
- I haven't managed to set up approval-voting tests: the spread of their results is too large and can't be narrowed down within a reasonable amount of time in the warm-up phase.
- The tests start an unconfigurable Prometheus endpoint internally, which causes errors because they all use the same port 9999. I disable it with a flag, but it would be better to extract the endpoint launching outside the tests, as we already do with `valgrind` and `pyroscope`; `prometheus` is still used inside the tests themselves.

### Future work
* #3528
* #3529
* #3530
* #3531

---------

Co-authored-by: Alexander Samusev <41779041+alvicsam@users.noreply.github.com>
skunert pushed a commit to skunert/polkadot-sdk that referenced this issue Mar 4, 2024