
change individual metric score rounding #13316

Closed
brendankenny opened this issue Nov 3, 2021 · 3 comments


brendankenny commented Nov 3, 2021

Historical background

Originally scores were going to be the primary way to communicate how a page did on individual metrics, with the actual numericValue (née rawValue) being less important and mostly included for automated tools consuming the LHR. We also decided to always round each metric score to a whole number (to two decimal places in the [0, 1] version) to keep score precision realistic for users.

Over time, the metric scores have almost completely disappeared from the HTML report, now only taking part in how they're combined into the overall perf score and in choosing a color/icon for each metric. numericValues, meanwhile, are displayed right at the top.

In that time, we've also moved from metric thresholds that weren't discussed externally with much precision (especially back when we defined our curves using a "point of diminishing returns" rather than the exact 10th/50th percentiles) to today, where many of the thresholds are shared with the field CWV thresholds and are published very publicly.

The problem

All this comes together to cause confusion for anyone analyzing Lighthouse results and expecting metric numericValues, scores, and thresholds to all be consistent. This came up in the context of some Web Almanac work, where I made the claim that it was equivalent to look at either a TBT score (≥ 0.9 for good, < 0.5 for poor) or a TBT numericValue (≤ 200ms for good, > 600ms for poor).

However, these are not equivalent, because we round the score :) In fact, a TBT as high as 204.856ms will still get a TBT score of 90, and a TBT as high as 606.48ms will still get a score of 50.
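To make the rounding effect concrete, here's a rough sketch of the shape of the scoring curve, not Lighthouse's actual getLogNormalScore implementation: the score is the complementary CDF of a log-normal distribution positioned so that p10 maps to 0.90 and median to 0.50, with an Abramowitz–Stegun erfc approximation standing in for the real math.

```js
// Sketch only: approximates the metric scoring curve (p10 -> 0.90, median -> 0.50).
// erfc() uses the Abramowitz & Stegun 7.1.26 approximation, not Lighthouse's own.
function erfc(x) {
  const t = 1 / (1 + 0.3275911 * Math.abs(x));
  const y = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
            t * (-1.453152027 + t * 1.061405429)))) * Math.exp(-x * x);
  return x >= 0 ? y : 2 - y; // erfc(-x) = 2 - erfc(x)
}

function approxLogNormalScore({p10, median}, value) {
  // erfc⁻¹(0.2) pins the score at exactly 0.9 when value === p10.
  const INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232;
  const standardizedX = INVERSE_ERFC_ONE_FIFTH * Math.log(value / median) / Math.log(median / p10);
  return erfc(standardizedX) / 2;
}

// TBT control points: p10 = 200ms, median = 600ms.
const rawScore = approxLogNormalScore({p10: 200, median: 600}, 204); // ≈ 0.896
Math.round(rawScore * 100); // 90 — a 204ms TBT, already over the 200ms threshold, still displays as 90
```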

Proposal

We still round the metric scores, but we always round down (use Math.floor() here). Any TBT score ≥ 90 will then also have a numericValue ≤ 200ms.
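A minimal sketch of the proposal (hypothetical helper name, not the existing clampTo2Decimals):

```js
// Floor rather than round when clamping to two decimals, so 0.90 is only
// reachable when the raw score is actually >= 0.90.
const floorTo2Decimals = val => Math.floor(val * 100) / 100;

floorTo2Decimals(0.8959); // 0.89 — the 204ms TBT from the example above no longer shows as 90
floorTo2Decimals(0.9041); // 0.90 — still 90 for a TBT comfortably under 200ms
```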

This will change scores (so it would be a breaking change), but it's arguably correcting scores to what the threshold system always intended.

Assuming metrics are currently evenly split between scores that are being rounded up and those being rounded down (probably reasonable), about half of sites will drop one point on each metric from this change. With our six weighted metrics (using LH v8/9 weights):

  • half of sites would lose 0.5 points in the overall perf score (with final rounding, possibly enough to drop them a point downwards)
  • 10% of sites would lose 0.8 points
  • about 1.5% of sites would see a full point drop

edit: actually the overall score calc is easy to get from this. Given the above, and assuming scores are evenly spread in their fractional portion, the final rounded perf score will drop by 1 point for 50% of sites; the rest will see no change in their final score. We could run it against HTTP Archive numbers to get an empirical percentage, but given that it's at most a 1 point drop regardless, a ballpark percentage seems fine.
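As a sanity check on that 50% figure, here's a quick simulation under the same assumptions (each metric independently loses a point with probability 0.5, fractional parts uniform; weights are the LH v8/v9 perf weights as listed in the scoring docs):

```js
// Monte-Carlo check of the 50% estimate. Assumes each metric's fractional score part
// is uniform (so flooring costs it a point with probability 0.5) and that the overall
// perf score's fractional part is uniform too.
const weights = [0.10, 0.10, 0.25, 0.30, 0.10, 0.15]; // FCP, SI, LCP, TBT, TTI, CLS

const N = 100_000;
let dropped = 0;
for (let i = 0; i < N; i++) {
  // Weighted drop (in 0-100 points) from flooring each metric score instead of rounding.
  const drop = weights.reduce((sum, w) => sum + (Math.random() < 0.5 ? w : 0), 0);
  // The final perf score loses a point iff the drop pushes it across a rounding boundary.
  const fraction = Math.random();
  if (Math.round(fraction) - Math.round(fraction - drop) >= 1) dropped++;
}
console.log(`${(100 * dropped / N).toFixed(1)}% of simulated sites lose a point`); // ~50%
```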

Overall not a large change for a significant consistency gain.

@brendankenny

There will be numeric issues (e.g. with Math.floor(), getLogNormalScore({p10: 200, median: 600}, 200) gives 0.8999999314038525 and so would get floored to 0.89), but since we're rounding anyway, it would be easy to throw in a clamp against the threshold depending on whether the numericValue is on one side of the threshold or the other.
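One way that clamp could look (illustrative only; hypothetical helper name, not necessarily what ends up in Lighthouse):

```js
// If the numericValue is at or better than a control point, don't let
// floating-point error pull the score below that control point before flooring.
function clampScoreToControlPoints(score, value, {p10, median}) {
  if (value <= p10) return Math.max(score, 0.9);
  if (value <= median) return Math.max(score, 0.5);
  return score;
}

// getLogNormalScore({p10: 200, median: 600}, 200) === 0.8999999314038525
const clamped = clampScoreToControlPoints(0.8999999314038525, 200, {p10: 200, median: 600});
Math.floor(clamped * 100) / 100; // 0.9 instead of 0.89
```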


connorjclark commented Nov 15, 2021

from eng discussion:

  1. let's change clampTo2Decimals to clampTo4Decimals
  2. add some tests with numeric values around the control points, asserting that the value passes as expected (rough sketch after this list)
  3. possibly add special rounding treatment in computeLogNormalScore with numeric values around control points
  4. let's not consider this a breaking change: the consistency outweighs the score change, our intention for scores wrt control points makes this a bug, and the score change is literally a rounding error, much less than expected variance
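A rough sketch of what items 1 and 2 could look like, reusing the approxLogNormalScore sketch from the first comment in place of the real scoring function (hypothetical test code, not the actual Lighthouse test suite):

```js
const assert = require('assert');
// approxLogNormalScore: see the sketch in the first comment above.

// Item 1: widen the clamp from 2 to 4 decimals so the ~1e-7 numerical error at a
// control point is absorbed by ordinary rounding.
const clampTo4Decimals = val => Math.round(val * 10_000) / 10_000;

// Item 2: assert behaviour right around the TBT control points (200ms / 600ms).
assert.strictEqual(clampTo4Decimals(approxLogNormalScore({p10: 200, median: 600}, 200)), 0.9);
assert(clampTo4Decimals(approxLogNormalScore({p10: 200, median: 600}, 201)) < 0.9);
assert.strictEqual(clampTo4Decimals(approxLogNormalScore({p10: 200, median: 600}, 600)), 0.5);
```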

@brendankenny

Going to call this closed by #13392 and #13559. We may still want to switch to 4-digit clamping for scores, but that wouldn't have fixed this issue. There would still be very small differences between control points and expected scores, but more importantly we'd also have to make sure that any downstream tool or the report never shows whole-number metric scores, or they would fall prey to this same issue (e.g. Intl.NumberFormat formatting a score as '90' but using the average color because the full-precision value is less than 90).
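For example (illustrative numbers and formatter options only):

```js
// The display pitfall: the formatter rounds a 0.8999 score up to "90", while any
// color/pass logic working on the full-precision value still treats it as below 90.
const score = 0.8999; // hypothetical 4-decimal-clamped metric score
new Intl.NumberFormat('en-US', {maximumFractionDigits: 0}).format(score * 100); // "90"
score * 100 >= 90; // false — the number reads as "good" while the color logic says "average"
```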

clampTo4Decimals can still be easily layered on top of this, though (switching 100 to 10_000 here), and any downstream or report rounding will do the correct thing by default.
