change individual metric score rounding #13316
Going to call this closed by #13392 and #13559. We may still want to switch to 4-digit clamping for scores, but that wouldn't have fixed this issue. There would still be very small differences between control points and expected scores, but more importantly we'd also have to make sure that any downstream tool or the report never shows whole-number metric scores, or they would fall prey to this same issue (e.g. …).
Historical background
Originally, scores were going to be the primary way to communicate how a page did on individual metrics, with the actual `numericValue` (née `rawValue`) being less important and mostly included for automated tools consuming the LHR. We also decided to always round each metric score to a whole number (to two decimal places in the [0, 1] version) to keep score precision realistic for users.

Over time, the metric scores have disappeared from the HTML report almost completely, only taking part in how they're combined to make the overall perf score and in giving a color/icon to each metric. `numericValue`s, meanwhile, are now displayed right at the top.

In that time, we've also moved from metric thresholds that weren't externally talked about in a very precise way (especially when we defined our curves using the "point of diminishing returns" rather than the exact 10th/50th percentiles) to today, where many of the thresholds are shared with the field CWV thresholds and are published very publicly.
The problem
All of this comes together to cause confusion for anyone trying to analyze Lighthouse results who expects metric `numericValue`s, scores, and thresholds to all be consistent. This came up in the context of some Web Almanac work, where I made the claim that it was equivalent to either look at a TBT `score` (≥ 0.9 for good, < 0.5 for poor) or a TBT `numericValue` (≤ 200ms for good, > 600ms for poor).

However, these are not equivalent, because we round the score :) Actually, a TBT as high as 204.856ms will still get a TBT score of 90, and a TBT as high as 606.48ms will still get a score of 50.
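To make the rounding effect concrete, here is a minimal, self-contained sketch of a log-normal scoring curve fit to two control points. It is not Lighthouse's actual `statistics.js`; the Abramowitz–Stegun erf approximation and the TBT control points (p10 = 200ms, median = 600ms) are this sketch's assumptions:

```ts
// Abramowitz–Stegun approximation of the error function (max error ~1.5e-7).
function erf(x: number): number {
  const sign = Math.sign(x);
  x = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * x);
  const y = t * (0.254829592 + t * (-0.284496736 +
      t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  return sign * (1 - y * Math.exp(-x * x));
}

// score(value) = P(X > value) for X ~ LogNormal(µ, σ), fit so that
// score(p10) = 0.9 and score(median) = 0.5.
function logNormalScore(p10: number, median: number, value: number): number {
  const mu = Math.log(median);
  const sigma = (mu - Math.log(p10)) / 1.2815515655; // 1.28155… = Φ⁻¹(0.9)
  const z = (Math.log(value) - mu) / (sigma * Math.SQRT2);
  return (1 - erf(z)) / 2; // complementary CDF of the log-normal
}

const score = logNormalScore(200, 600, 204.856);
console.log(score.toFixed(4));              // ≈ 0.8950
console.log(Math.round(score * 100) / 100); // 0.9 — reads as "good" despite TBT > 200ms
```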
Proposal
We still round the metric scores, but we always round down (use `Math.floor()` in the clamping step). Any TBT score ≥ 90 will then also have a numericValue ≤ 200ms.

This will change scores (so it would be a breaking change), but it's arguably correcting scores to what the threshold system was always intending.
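A sketch of the proposed change, reusing the hypothetical `logNormalScore` from the snippet above (the two-decimal clamp stands in for however the [0, 1] score is actually rounded):

```ts
// Current behavior: round to the nearest two decimals. Proposed: always floor.
const roundClamp = (score: number) => Math.round(score * 100) / 100;
const floorClamp = (score: number) => Math.floor(score * 100) / 100;

const tbtScore = logNormalScore(200, 600, 204.856); // ≈ 0.895
console.log(roundClamp(tbtScore)); // 0.9  — inconsistent with numericValue > 200ms
console.log(floorClamp(tbtScore)); // 0.89 — now score ≥ 0.9 ⇒ numericValue ≤ 200ms
```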
Assuming metrics are currently evenly split between scores that are being rounded up and those being rounded down (probably reasonable), about half of sites will drop one point on each metric from this change. With our six weighted metrics (using LH v8/9 weights), the overall effect follows directly.
edit: actually, the overall score calc is easy to get from this. Given the above, and if we assume scores are evenly spread in their fractional portion, the final rounded perf score will drop by 1 point for 50% of sites; the rest will see no change in their final score. We could run it against HTTP Archive numbers to get an empirical percentage, but given that it's at most a 1 point drop regardless, a ballpark percentage seems fine (see the simulation sketch below).
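For illustration, a quick Monte Carlo under exactly those assumptions. The weights (FCP/SI/TTI 10%, LCP 25%, TBT 30%, CLS 15%) are copied from the v8 scoring docs and should be treated as this sketch's assumption:

```ts
// Estimate how often the displayed 0–100 perf score drops when per-metric
// clamping switches from round to floor, with uniform random metric scores.
const WEIGHTS = [0.10, 0.10, 0.25, 0.10, 0.30, 0.15]; // fcp, si, lcp, tti, tbt, cls

function perfScore(raw: number[], clamp: (s: number) => number): number {
  const weighted = raw.reduce((sum, s, i) => sum + clamp(s) * WEIGHTS[i], 0);
  return Math.round(weighted * 100); // displayed 0–100 perf score
}

const round2 = (s: number) => Math.round(s * 100) / 100;
const floor2 = (s: number) => Math.floor(s * 100) / 100;

let drops = 0;
const TRIALS = 100_000;
for (let i = 0; i < TRIALS; i++) {
  const raw = WEIGHTS.map(() => Math.random()); // six random metric scores in [0, 1]
  if (perfScore(raw, floor2) < perfScore(raw, round2)) drops++;
}
console.log(`${(100 * drops / TRIALS).toFixed(1)}% of trials drop a point`); // ≈ 50%
```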
Overall not a large change for a significant consistency gain.