
core: ensure good and average scores start exactly at control points #13559

Merged · 3 commits · Jan 12, 2022

Conversation

@brendankenny (Member) commented on Jan 11, 2022

part of #13316

(this description refers to scores between 0 and 100 because it makes talking about rounding and flooring easier, but everything is actually done on scores between 0 and 1)

Our raw log-normal scoring method takes audit values and generates percentile scores based on control points. For example, TBT's p10 control point of 200 ensures that TBT values ≤ 200ms get a 90+ score and TBT values > 200ms get a score less than 90.
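For context, here's a rough reconstruction of how two control points pin down the scoring curve. The real implementation is statistics.getLogNormalScore in lighthouse-core/lib/statistics.js; the erf approximation and the 600ms TBT median below are illustrative assumptions, not copied from the source.

```js
// Illustrative reconstruction of log-normal scoring from two control points.
// Not the actual Lighthouse code; see statistics.getLogNormalScore.

// Abramowitz & Stegun 7.1.26 approximation of erf(x).
function erf(x) {
  const sign = Math.sign(x);
  x = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * x);
  const y = t * (0.254829592 + t * (-0.284496736 +
    t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  return sign * (1 - y * Math.exp(-x * x));
}

// Standard normal CDF.
const phi = x => 0.5 * (1 + erf(x / Math.SQRT2));

function getLogNormalScore({median, p10}, value) {
  // CDF(median) = 0.5 pins mu; requiring a score of exactly 0.9 at the
  // p10 control point (i.e. CDF(p10) = 0.1) pins sigma.
  const mu = Math.log(median);
  const sigma = (mu - Math.log(p10)) / 1.2815515655446004; // Phi^-1(0.9)
  // The score is the survival function: the fraction of the distribution
  // that is worse (slower) than `value`.
  return 1 - phi((Math.log(value) - mu) / sigma);
}

// With TBT's p10 of 200ms (600ms median assumed for illustration):
getLogNormalScore({median: 600, p10: 200}, 200);    // ≈0.9 at the p10 control point
getLogNormalScore({median: 600, p10: 200}, 600);    // ≈0.5 at the median
getLogNormalScore({median: 600, p10: 200}, 204.86); // ≈0.895, the old effective cutoff
```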

The problem outlined in #13316 is that audit.js then rounds the score prior to adding it to the audit result. A score less than 90 can be rounded up to 90 if it's 89.5 or above, so in practice the TBT p10 control point has actually been ~204.86ms.


The simplest fix is to floor the score rather than round it. With the guaranteed score precision provided by #13392, we can be confident that Math.floor(percentile) ensures a value is never given a score of 50 or 90 (or above) unless it's at or faster than the median or p10 control point, respectively.
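As a minimal sketch of the boundary behavior (assuming the two-decimal clamping in audit.js takes roughly this shape), compare the two on a made-up raw score for a TBT value just past the 200ms control point:

```js
// Hypothetical raw score from getLogNormalScore for a TBT just over 200ms.
const score = 0.8997;

Math.round(score * 100) / 100; // 0.9  -> displayed as 90, despite being past p10
Math.floor(score * 100) / 100; // 0.89 -> displayed as 89, respecting the control point
```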

Here is how the before/after rounding looks between a score of 20 and 30:

  • the black line is the raw value returned by statistics.getLogNormalScore
  • the blue line is the previous Math.round() clamping, where half the time the rounded score is above the raw score and half the time it's below
  • the red line is the new Math.floor() clamping, where the rounded score is always below the raw score. A rounded score of 90 can only have been from a 90+ raw score.

This does mean that the new rounding matches the old rounding half the time, but the other half of the time it's 1 point lower. That's for individual audits/metrics; as discussed in #13316, we should expect about half of sites to lose 1 point in their final perf score and the other half to be unchanged.


However! There's a big issue with this: attainability of a perfect score. Previously a score of 99.5 would get rounded up to 100, so there was enough room there for some sites to get a perfect score with metric numbers that are attainable in the real world. Flooring the scores takes those sites to 99 instead.

A couple of different approaches could correct this. One lightweight way is to apply a small correction that transitions from the new floor clamping at 90 (where we need it to precisely match the control point) to the old rounding at 100 (where we want the old round-up behavior). This spreads scores between 90 and 100 over a range of 90 to 100.5, so the concluding Math.floor() takes those top-end scores to 100.

Here is what flooring vs flooring with the correction looks like:

Below a score of 90, it's identical to just flooring. As it rises above 90 and approaches 100, it gets closer and closer to what the old Math.round() scoring looked like, sometimes under and sometimes over the raw score.

This might sound more complicated than it is: it's a linear ramp from 0 to 0.5 points between scores of 90 and 100 ((score - 0.9) / 0.1 * 0.005 on 0-to-1 scores).
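Putting the two pieces together, a sketch of the floor-plus-ramp clamping (the names here are illustrative; the actual change lives in audit.js):

```js
function clampScore(score) {
  if (score > 0.9) {
    // Linear ramp from 0 at a score of 0.9 up to 0.005 at 1.0, restoring
    // the old round-up-to-100 behavior at the top end.
    score += (score - 0.9) / 0.1 * 0.005;
  }
  return Math.floor(score * 100) / 100;
}

clampScore(0.8997); // 0.89 - below 0.9 this is identical to plain flooring
clampScore(0.9);    // 0.9  - the control point is preserved exactly
clampScore(0.996);  // 1    - 0.996 + 0.0048 = 1.0008 floors to a perfect score
```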

Previously any score of 99.5 or above would get rounded up to 100; now any score over ~99.524 will be. As a result, over 95% of sites that previously got a 100 on a particular audit will still get a perfect score. The impact on the overall perf score depends much more on the distribution of site scores than this estimate does, but since the correction is strictly a score boost for some sites, we can at least expect fewer than half of sites to lose a point on the overall perf score.
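(For the record, the ~99.524 cutoff falls straight out of the ramp: a raw score s reaches 100 after the final floor when s + (s - 0.9)/0.1 × 0.005 ≥ 1, i.e. 1.05·s ≥ 1.045, i.e. s ≥ 1.045/1.05 ≈ 0.995238.)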

@brendankenny brendankenny requested a review from a team as a code owner January 11, 2022 22:10
@brendankenny brendankenny requested review from connorjclark and removed request for a team January 11, 2022 22:10
@connorjclark (Collaborator) left a comment


LGTM

Not much to say other than, pretty graphs 😍
