Discussion: making the result table more compact #231
Comments
You can use a custom |
Thanks, that's a great workaround. I think there is still value in improving the default rendering, as it reduces both the time to write benchmarks and the time to interpret the results. |
The main issue is that same words are repeated because You can make a PR to refine the default, but:
|
It turns out that Are you suggesting to rename The standard deviations are indeed useful. I didn't want to remove them. What I found less useful is the distance between the two middle samples when there is an even number of them. |
It was your suggestion ;-) I just think that 50% is less self-explanatory than med.
It's in fact the confidence interval around the mean that is displayed. |
The plus or minus needs to be there but one of the first things I did making my own table was drop the quotes around the numeric values. The fact that they’re inconsistent is just weird. Why is throughput in quotes and samples not in quotes? |
string vs. number handling in |
How about 'mean' instead of 'average'? And are the cool kids using p50 these days instead of 50%? I half-snoozed through confidence intervals in physics, but I believe that if you're showing ±2.4% then showing two decimal places as if they are significant figures is wrong. |
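The significant-figures point in the comment above can be made concrete with a minimal sketch. It follows one common convention: round the margin to its leading digit and round the value to the same decimal position. The function name `formatWithUncertainty` is hypothetical and is not part of tinybench.

```typescript
// Sketch only: one common significant-figures convention, not tinybench behavior.
// Rounds the absolute margin to its leading digit and the value to the same position.
function formatWithUncertainty(value: number, relativeMarginPct: number): string {
  const margin = (Math.abs(value) * relativeMarginPct) / 100;
  if (!(margin > 0) || !Number.isFinite(margin)) {
    return String(value); // nothing sensible to round against
  }
  // Decimal position of the margin's leading digit (1 means tens, -2 means hundredths).
  const exponent = Math.floor(Math.log10(margin));
  if (exponent >= 0) {
    const step = 10 ** exponent;
    return `${Math.round(value / step) * step} ± ${Math.round(margin / step) * step}`;
  }
  const decimals = -exponent;
  return `${value.toFixed(decimals)} ± ${margin.toFixed(decimals)}`;
}

// Illustrative input: a value of 1234.5678 reported with a ±2.4% margin.
console.log(formatWithUncertainty(1234.5678, 2.4));
```

Under this convention the number of digits shown follows from the margin, so a wider interval automatically produces a shorter number.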
console.table is also responsible for that '(index)' nonsense. Hmmm. But it looks like console.table exists solely in the examples, so if someone tweaks the column names coming from tinybench, any additional cleanup could be represented in the examples by importing one of the console.table replacements. Based on my forks, it looks like I used https://github.com/ayonious/console-table-printer/ and then had trouble with it trying to put ANSI color codes into non-interactive terminals. @jerome-benoit Do you have a CLI table generator you like? (Also, none of the ones I can find support nested column names, so that might have to be out.) |
Reading the code will tell you. Unless you formally prove Student distribution based confidence interval computation is wrong for mean and MAD for median, they are as accurate as possible given the original Student assumptions. |
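For readers who have not read the code, a confidence interval around the mean of the kind mentioned above can be sketched as follows. This is not tinybench's implementation. The name `summarizeMean` is hypothetical, and the critical value is passed in by the caller: 1.96 is the normal approximation for large samples, while a Student's t table gives larger values for small ones, for example about 2.365 for 8 samples at 95%.

```typescript
// Sketch only: mean with a margin of error, assuming the caller supplies
// the critical value for the desired confidence level and sample size.
interface MeanSummary {
  mean: number; // arithmetic mean of the samples
  sem: number;  // standard error of the mean
  moe: number;  // absolute margin of error (half-width of the interval)
  rme: number;  // margin of error relative to the mean, in percent
}

function summarizeMean(samples: number[], critical = 1.96): MeanSummary {
  const n = samples.length;
  if (n < 2) throw new Error("need at least two samples");
  const mean = samples.reduce((sum, x) => sum + x, 0) / n;
  // Unbiased sample variance (divides by n - 1).
  const variance = samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1);
  const sem = Math.sqrt(variance / n);
  const moe = sem * critical;
  const rme = (moe / mean) * 100;
  return { mean, sem, moe, rme };
}

// Illustrative samples, with the t critical value for 7 degrees of freedom.
console.log(summarizeMean([2, 4, 4, 4, 5, 5, 7, 9], 2.365));
```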
No the code isn’t going to tell me how physicists track significant figures. Even if you copied a physics book into the code it’s a tertiary reference. |
I disagree with this. All profilers lie, some more than most. The numbers you don’t think are necessary are used by the people who come after you declare this code is as fast as it can get and find another 10%. A broad interval and similar numbers to a previous run helps indicate that something interfered with this run and you should try it again. And that your change might not have accomplished anything. |
It's going to tell you how the confidence interval is actually computed, which method is used and whether it's correctly implemented, as a primary reference. Unless you have proof:
You are just making bold claims backed by ... nothing. |
You’re showing hundredths of a nanosecond on a VM that is not even accurate to the nanosecond. Two decimal places on a median value is a bold claim. I agree with @pallosp that the decimal places are not helpful. In fact they cannot possibly be correct. And if multiple runs report values that don’t overlap in their error bars - which happens all the time - then what are the numbers telling a user? It’s like the cosmological crisis. I can’t reproduce my own results let alone someone else’s. Not even on the same hardware. |
You are basically saying that doing statistics over samples is incorrect, still without bringing any factual proof of anything. Unless you are actually able to prove that:
There's no point in continuing, with the uncertain hope that it will maybe lead to an interesting outcome for tinybench |
I don’t understand why you’re being so defensive. Can you get a timestamp from Node in hundredths of a nanosecond? No. Can you trust the last digits of the nanosecond measure that’s a system call? Also no. So if I’m using essentially a child’s ruler with big fat millimeter markings on it, it’s inaccurate to try to record tenths of a millimeter with that measuring device. The median would at most ever have one decimal place, if you believe that the 1’s digit in the nanoseconds is accurate, and that value can only ever be .5, or not exist at all. I’m talking about the measurements, you’re talking about the statistical analysis. |
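The claim about the median can be checked with a small sketch. It assumes, as the comment does, that every sample is a whole number of nanoseconds. The `median` helper is hypothetical and written only for this illustration.

```typescript
// Sketch only: median of samples assumed to be whole nanoseconds.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: the middle sample. Even count: mean of the two middle samples.
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// With integer inputs the fractional part of the median is either 0 or 0.5.
console.log(median([120, 118, 131]));      // odd count: an integer
console.log(median([120, 118, 131, 121])); // even count: integer or ends in .5
```

If the samples are stored in another unit or are not integers, which this sketch does not cover, the median can carry more decimals.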
The units of the samples used for statistical analysis are meaningless to any statistical indicator built on them, by mathematical construction. So meaningless that any measurement inaccuracy in the samples is trapped by the proper indicators: standard deviation, confidence interval, skew, ... The decimals of a simple indicator such as a mean, once it passes the common statistical significance checks, are representative. The proof of it is in any sufficiently good book on statistics that discusses their relevance. Building a meaningful criticism of mathematical tools requires an in-depth understanding of them in the first place. |
I was going to say this is something I covered in #65, but it turns out my memory is faulty, so here's an update showing how wonky hrtime() still is in Node 22. Also you're being super passive aggressive right now and I don't appreciate it. |
My workflow includes running benchmarks in the VS Code terminal regularly.
Currently I can't give the tasks names longer than 20 characters, because they would make the table overflow and break its layout. Therefore I'm suggesting a couple of changes to make the metric columns narrower.
- `(index)` to `#`. Saves 5-6 characters.
- `Latency average`: `average` to `avg`. `Latency median`: `median` to `50%`. These changes together may save 3-7 characters.
- `Throughput average`: `average` to `avg`. `Throughput median`: `median` to `50%`. Saves 3 characters.
Before
After
If you agree, I'm happy to send a pull request.
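Until the defaults change, the renames proposed above could be applied to already-built rows with a sketch like the one below. The `Row` type, the `compactHeaders` helper, and the sample values are all hypothetical and made up for illustration. The header names are the ones quoted in this issue, and the real headers may carry extra text such as units. The `(index)` column is added by `console.table` itself, so this sketch does not touch it.

```typescript
// Hypothetical row shape: one record per task, keyed by column header.
type Row = Record<string, string | number>;

// Abbreviations proposed in this issue.
const ABBREVIATIONS: ReadonlyArray<[RegExp, string]> = [
  [/\baverage\b/, "avg"],
  [/\bmedian\b/, "50%"],
];

// Return a copy of the row with every header shortened; values are untouched.
function compactHeaders(row: Row): Row {
  const out: Row = {};
  for (const [header, value] of Object.entries(row)) {
    let short = header;
    for (const [pattern, replacement] of ABBREVIATIONS) {
      short = short.replace(pattern, replacement);
    }
    out[short] = value;
  }
  return out;
}

// Made-up values, for illustration only.
const verbose: Row = {
  "Task name": "example task",
  "Latency average": "1230 ± 30",
  "Latency median": 1200,
  "Throughput average": "810000 ± 9000",
  "Throughput median": 833000,
};

console.table([compactHeaders(verbose)]);
```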