Update misleading statistics titles #113

Open
ericharmeling opened this issue Jan 4, 2022 · 0 comments
Problem

@fabiog1901 recently expressed concern over the accuracy of the statistics calculator.

... these are the stats output from the python workload movr: don't these stats look wrong?

transaction name      time(total)    ops(total)    ops    ops/second    p50(ms)    p90(ms)    p95(ms)    max(ms)
------------------  -------------  ------------  -----  ------------  ---------  ---------  ---------  ---------
add vehicle                   136            90      7     0.0515746      15.89      20.83      23.48      26.12
apply promo code              136           172     19     0.139987       23.53      34.08      37.68      39.52
end ride                      136           399     37     0.272606       28.7       36.69      37.68      53.11
get vehicles                  136         28450   3307    24.3646         16.92      22.92      27.38     107.2
log ride location             136         15550   1623    11.9574         15.21      18.71      20.82      57.89
new promo code                136            48      4     0.0294694      15.75      18.39      18.93      19.47
new user                      136           397     41     0.30206        15.54      18.32      18.74      57.73
start ride                    136           449     55     0.405197       28.86      41.32      42.93      50.45 

Take the "get vehicles" row: in 136s the workload has processed 28450, so 28450/136= 209 ops/second, not 24. 24 is the result of the partial ops 3307/136, which does not make sense calculating. Rather, it should be ops/current_period_duration, that is, 3307/15s = 220

I think these column labels are just misleading. The ops(total) column should be a cumulative total for the lifetime of the generator. It follows that time(total) should also be the cumulative time elapsed for the lifetime of the generator, but it appears to be, in fact, just the lifetime of the stats calculator (what Fabio called current_period_duration).

ops/second is roughly ops/time(total) (https://github.com/cockroachdb/movr/blob/master/movr_stats.py#L45). It appears imprecise because the time displayed under time(total) is rounded to the nearest integer.
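
As a quick check of the arithmetic, the sketch below recomputes the rates for the "get vehicles" row using only the values printed in the table. The variable names are illustrative and are not taken from movr_stats.py.

```python
# Values copied from the "get vehicles" row of the table above.
ops_total = 28450         # ops(total)
ops_period = 3307         # ops
time_displayed = 136      # time(total), as printed (an integer)
ops_per_second = 24.3646  # ops/second, as printed

# Rate obtained by dividing by the displayed, rounded time.
rate_from_displayed_time = ops_period / time_displayed

# Elapsed time implied by the printed rate, before any rounding.
implied_elapsed = ops_period / ops_per_second

# Cumulative rate that the "total" labels suggest.
cumulative_rate = ops_total / time_displayed

print(f"{rate_from_displayed_time:.4f}")  # about 24.32, close to but not equal to 24.3646
print(f"{implied_elapsed:.2f}")           # about 135.73, which rounds to 136
print(f"{cumulative_rate:.1f}")           # about 209, the figure Fabio expected
```

The implied elapsed time rounds to the 136 shown in the table, which is consistent with the mismatch coming from the rounded display rather than from the rate calculation itself.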

Suggested resolution

Update the stats table titles to make the time statistic title less confusing:
https://github.com/cockroachdb/movr/blob/master/movr_stats.py#L54

We could also display the time measurement with more precision instead of rounding it to the nearest integer.
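
A minimal sketch of what a more precise time display could look like. The helper name `format_elapsed` and the use of `time.perf_counter` are assumptions for illustration; this is not how movr_stats.py currently measures or formats time.

```python
import time


def format_elapsed(seconds, digits=2):
    """Render an elapsed time in seconds with a fixed number of decimals.

    Hypothetical helper, not part of movr_stats.py.
    """
    return f"{seconds:.{digits}f}"


start = time.perf_counter()
# ... the workload would run here ...
elapsed = time.perf_counter() - start

print(format_elapsed(elapsed))      # elapsed time with two decimals
print(format_elapsed(135.7297))     # 135.73
print(format_elapsed(135.7297, 0))  # 136, the integer form shown in the table today
```

With two decimals, dividing the ops column by the displayed time reproduces the printed ops/second value much more closely.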

@ericharmeling ericharmeling self-assigned this Jan 4, 2022