Summary
This issue is about building a visualization dashboard that automatically displays the stats from the latest daily loadgen runs.
Context
The loadgen's primary use is to build a regression over time of the behavior of the SDK (see Agoric/agoric-sdk#3107). In this case, time has two dimensions:
Behavior of the chain over the lifetime of the chain (performance should stay stable and not degrade).
Behavior of the chain when changes are introduced across revisions (performance should not become notably worse, and should hopefully get better).
The first is mostly captured by running loadgen cycles split into stages (currently 4 stages of 6 hours for the daily perf run) and comparing the stages to each other. The second is captured by comparing summarized metrics between revisions (different daily perf runs).
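As a rough illustration of these two comparisons, the sketch below computes the relative change of one metric across the stages of a single run, and flags a change between the summaries of two runs. The function names, the sample values, and the 10% threshold are assumptions made for illustration only; none of them come from the loadgen or from Agoric/agoric-sdk#3107.

```python
from typing import List


def relative_change(baseline: float, current: float) -> float:
    """Return the fractional change from baseline to current (0.25 means +25%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


def stage_drift(stage_values: List[float]) -> List[float]:
    """Relative change of each later stage against the first stage of the same run."""
    first = stage_values[0]
    return [relative_change(first, value) for value in stage_values[1:]]


def is_regression(previous_run: float, latest_run: float, threshold: float = 0.10) -> bool:
    """True when a higher-is-worse metric grew by more than `threshold` between runs.

    The 10% default is an arbitrary placeholder, not a project setting.
    """
    return relative_change(previous_run, latest_run) > threshold
```

For a higher-is-worse metric such as a duration, `stage_drift` covers the first dimension (drift within one run) and `is_regression` covers the second (change between daily runs).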
Current tooling
Currently the stats are saved in a perf.jsonl file, which contains a stream of CPU and memory usage stats followed by a final summary of all other stats. #43 deals with unifying these so that individual stats data points are output in the stream and only summaries are generated at the end, possibly including summaries of the CPU and memory usage.
The visualization is done by extracting the stats summaries into a CSV file (see https://github.com/Agoric/testnet-load-generator/blob/main/scripts/perf_to_stats_csv.jq) and importing that into a Google Spreadsheet with some graphs.
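The extraction step could look roughly like the following Python sketch, which reads JSON Lines text, keeps the summary records, and renders them as CSV. This is not a port of perf_to_stats_csv.jq, and the record layout is an assumption: the `"type"` field and the `"summary"` value are hypothetical stand-ins for however perf.jsonl actually distinguishes summaries from usage samples.

```python
import csv
import io
import json
from typing import Dict, Iterable, List


def summary_rows(lines: Iterable[str], summary_type: str = "summary") -> List[Dict[str, object]]:
    """Collect JSON objects whose (hypothetical) "type" field equals summary_type."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the stream
        record = json.loads(line)
        if isinstance(record, dict) and record.get("type") == summary_type:
            rows.append(record)
    return rows


def to_csv(rows: List[Dict[str, object]]) -> str:
    """Render rows as CSV text; the header is the union of keys in first-seen order."""
    fieldnames: List[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()
```

Reading the file line by line keeps memory use flat even when the usage stream is long, which matters if individual data points are later added to the stream as described in #43.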
Detailed requirements
We would like to have a dashboard that shows the data detailed in Agoric/agoric-sdk#3107, which is automatically updated to include the results from the latest daily run.
If a run fails, the dashboard should make it obvious, and possibly send alerts. It should also alert if no data has been received recently (to highlight a stuck loadgen, for example).
The dashboard does not need to show data for an in-progress loadgen; that is a separate issue (TBD).
It would be great if the dashboard made it easy to generate new graphs from the existing data, or to run queries.
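The failure and "no recent data" requirements can be sketched as a single status check. This is a minimal illustration under stated assumptions: the function name, its parameters, and the 36-hour limit are all hypothetical, and how the dashboard would learn the end time and outcome of the last run is left open.

```python
from datetime import datetime, timedelta
from typing import Optional


def run_status(
    last_run_end: Optional[datetime],
    last_run_ok: bool,
    now: datetime,
    max_age: timedelta = timedelta(hours=36),  # placeholder limit for a daily run
) -> str:
    """Classify the most recent daily run as "stale", "failed", or "ok".

    "stale" covers both "never received data" and "data is older than max_age",
    which is the stuck-loadgen case.
    """
    if last_run_end is None or now - last_run_end > max_age:
        return "stale"
    if not last_run_ok:
        return "failed"
    return "ok"
```

Staleness is checked before failure so that an old failed run is reported as missing data rather than as a fresh failure. Both non-"ok" results would be candidates for an alert.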