
Fix osu! standardised score conversion sometimes exceeding bounds #25876

Merged (5 commits) into ppy:master on Dec 20, 2023

Conversation

@bdach (Collaborator) commented Dec 18, 2023:

Closes #25860.

Concept of the change provided by @Zyfarok.

Users reported that some stable scores would convert to large negative total scores in lazer after the introduction of combo exponent. Those large negative total scores were actually mangled NaNs.

The root cause of this was the following calculation going below zero unexpectedly:

remainingComboPortionInScoreV1 = comboPortionInScoreV1 - comboPortionFromLongestComboInScoreV1;

which then propagates negative numbers onward until

remainingComboPortionInStandardisedScore = remainingCountOfObjectsGivingCombo * Math.Pow(lengthOfRemainingCombos, ScoreProcessor.COMBO_EXPONENT);

which yields a NaN due to attempting to take the square root of a negative number.

To fix, clamp comboPortionInScoreV1 to sane limits: to comboPortionFromLongestComboInScoreV1 from below, and to maximumAchievableComboPortionInScoreV1 from above. This is a less direct fix than perhaps imagined, but it seems like a better one as it will also affect the calculation of both the lower and the upper estimate of the score.
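The failure mode and the fix are easy to reproduce outside the codebase. Below is a minimal Python sketch (hypothetical numbers, not from a real score; variable names mirror the C# ones from the PR). Note that C#'s Math.Pow(negative, 0.5) silently yields NaN, whereas Python's math.pow raises a domain error:

```python
import math

COMBO_EXPONENT = 0.5  # ScoreProcessor.COMBO_EXPONENT

def clamp(value, lo, hi):
    # stand-in for C#'s Math.Clamp (assuming lo <= hi)
    return max(lo, min(hi, value))

# Hypothetical ScoreV1 figures (not taken from a real score):
combo_portion = 90_000.0           # comboPortionInScoreV1
longest_combo_portion = 100_000.0  # comboPortionFromLongestComboInScoreV1: larger than the total!
max_achievable_portion = 150_000.0 # maximumAchievableComboPortionInScoreV1

# Unclamped, the subtraction goes negative...
remaining = combo_portion - longest_combo_portion
assert remaining < 0

# ...and raising a negative number to the 0.5 power (a square root) is invalid:
# C#'s Math.Pow silently yields NaN here, while Python's math.pow raises.
try:
    math.pow(remaining, COMBO_EXPONENT)
except ValueError:
    pass  # "math domain error": the C# equivalent of the NaN

# The fix: clamp the combo portion into [longest, max achievable] first,
# so the remainder can never go negative.
combo_portion = clamp(combo_portion, longest_combo_portion, max_achievable_portion)
remaining = combo_portion - longest_combo_portion
assert remaining == 0.0
assert math.pow(remaining, COMBO_EXPONENT) == 0.0
```
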

Co-authored-by: Zyf <zyfarok@gmail.com>

@@ -316,8 +316,7 @@ private void convertLegacyTotalScoreToStandardised()

     HashSet<Guid> scoreIds = realmAccess.Run(r => new HashSet<Guid>(r.All<ScoreInfo>()
         .Where(s => !s.BackgroundReprocessingFailed && s.BeatmapInfo != null
-            && (s.TotalScoreVersion == 30000002
-                || s.TotalScoreVersion == 30000003))
+            && s.TotalScoreVersion < LegacyScoreEncoder.LATEST_VERSION)
@bdach:

In my testing, which consisted of:

  • importing the legacy replays from the issue thread on master
  • moving to this branch

this change appears to be required to have the NaN score actually recompute correctly.

I'm not sure why this code previously insisted on listing every individual version separately; I even brought this up before. I'm also wondering whether leaving this unchanged meant that no preexisting scores got converted last release?

@bdach:

From further testing, it seems that yes, users' pre-existing imported stable scores have actually not been recomputed on the latest release. See #25876 (comment) for more on this.

Member:

Agree that this looks more correct.

@@ -26,7 +26,7 @@ public static bool ShouldMigrateToNewStandardised(ScoreInfo score)

     if (score.IsLegacyScore)
         return false;

-    if (score.TotalScoreVersion > 30000004)
+    if (score.TotalScoreVersion > 30000005)
@bdach commented Dec 18, 2023:

Tempted to also make this >= LegacyScoreDecoder.LATEST_VERSION, but I'm also deeply confused by the score version mechanism at this point. @smoogipoo please advise.

@smoogipoo commented Dec 19, 2023:

How does this work with the rest of the migration stuff? Afaik this was only here to convert scores <= 002 or so (old, linear-style standardised scoring) to the new ScoreV2-style one.

Edit: What I mean is, is there ever a chance this could run before scores are upgraded to LATEST_VERSION? If not, I agree with your proposal.

That being said, I personally feel like we should remove all of this score conversion stuff. We're not doing this server-side and there will also come a point where old lazer scores are removed from the leaderboards, at which point perhaps we can have an option to "show obsolete scores" or something like that to do the same for the local leaderboard (which this has more of an effect on).

@bdach:

What I mean is, is there ever a chance this could run before scores are upgraded to LATEST_VERSION?

Can you elaborate? I'm not sure I follow...

@smoogipoo commented Dec 19, 2023:

Let's say this changed line was merged in, and a score exists in the database with version 005.

  1. This condition now fails for that score.
  2. We compare GetOldStandardised(score) == score. This evaluates false.
  3. Likewise scoreIsVeryOld will also evaluate false.
  4. The method returns false and we'll "upgrade" the score via GetNewStandardised().

As I understand it, the above upgrade is lossy. I would prefer that we keep current scorev2-esque scores the way they are so this upgrade isn't done.

If the 005 score would first get upgraded to 006 via your change to the BackgroundDataStoreProcessor, that would cause the condition in (1) to now pass, then this change is fine.

This entire code block should only be valid for scores set in that linear standardised scoring, before scorev2 was done.

@bdach:

4. The method returns false and we'll "upgrade" the score via GetNewStandardised()

Um... If ShouldMigrateToNewStandardised() returns false, which as you said it will, then the score will not be upgraded...?

if (StandardisedScoreMigrationTools.ShouldMigrateToNewStandardised(model))
    model.TotalScore = StandardisedScoreMigrationTools.GetNewStandardised(model);
else if (model.IsLegacyScore)
{
    model.LegacyTotalScore = model.TotalScore;
    model.TotalScore = StandardisedScoreMigrationTools.ConvertFromLegacyTotalScore(model, beatmaps());
}

if (StandardisedScoreMigrationTools.ShouldMigrateToNewStandardised(score))
{
    try
    {
        long calculatedNew = StandardisedScoreMigrationTools.GetNewStandardised(score);
        score.TotalScore = calculatedNew;
    }
    catch
    {
    }
}

On the other side, scores imported from stable receive version 30000001:

// TotalScoreVersion gets initialised to LATEST_VERSION.
// In the case where the incoming score has either an osu!stable or old lazer version, we need
// to mark it with the correct version increment to trigger reprocessing to new standardised scoring.
//
// See StandardisedScoreMigrationTools.ShouldMigrateToNewStandardised().
scoreInfo.TotalScoreVersion = version < 30000002 ? 30000001 : LegacyScoreEncoder.LATEST_VERSION;

so the BackgroundDataStoreProcessor change was supposed to make them reprocess...

That said, I'll go recheck empirically again, because this stuff is very confusing (as it has been every time up to now).

@smoogipoo:

Oh sorry, I mentally inverted the condition of oldScoreMatchesExpectations. Probably worth going over this again with a fine comb, because it's very hard to understand.

@bdach:

Looking at this with a finer comb again,

Afaik this was only here to convert scores <= 002 or so (old, linear-style standardised scoring) to the new ScoreV2-style one.

I'm pretty sure you are correct here, and that this was wrongly bumped in 5dee438 on the assumption that it would lead to scores getting reprocessed, when it was actually BackgroundDataStoreProcessor that needed changing.

I've reverted this in ddb67c8, but I'd probably want to double-check with @peppy that that is what happened there...

Member:

Everything said here looks to check out.

@bdach commented Dec 18, 2023:

!diffcalc
GENERATORS=score

I wonder if this is even going to work...

@smoogipoo:

Diffcalc died because we're changing osu-queue-score-statistics and table structures every other day... 🤦

@smoogipoo:

May be fixed now...

!diffcalc
GENERATORS=score

@bdach bdach added the next release Pull requests which are almost there. We'll aim to get them in the next release, but no guarantees! label Dec 19, 2023
@smoogipoo:

Looks like this PR failed during the import (on this branch, not master):

2023-12-19T06:32:49.3902700Z execution-7256499449-6077-1-importer-1   | System.AggregateException: One or more errors occurred. ('319225' cannot be greater than 318096.)
2023-12-19T06:32:49.3903927Z execution-7256499449-6077-1-importer-1   |  ---> System.ArgumentException: '319225' cannot be greater than 318096.
2023-12-19T06:32:49.3904953Z execution-7256499449-6077-1-importer-1   |    at System.Math.ThrowMinMaxException[T](T min, T max)
2023-12-19T06:32:49.3907291Z execution-7256499449-6077-1-importer-1   |    at osu.Game.Database.StandardisedScoreMigrationTools.ConvertFromLegacyTotalScore(ScoreInfo score, LegacyBeatmapConversionDifficultyInfo difficulty, LegacyScoreAttributes attributes) in /tmp/tmp.5hV96I4JK5/osu/osu.Game/Database/StandardisedScoreMigrationTools.cs:line 239

Unfortunately, I can't get you the actual score ID immediately - needs a change to osu-queue-score-statistics that peppy's working on.

@bdach commented Dec 19, 2023:

That must be the clamp throwing. Initially I thought the score may not matter, but now that I look at this again, because

double comboPortionFromLongestComboInScoreV1 = Math.Pow(score.MaxCombo, 2);

double maximumAchievableComboPortionInScoreV1 = Math.Pow(maximumLegacyCombo, 2);

this would imply maximumLegacyCombo < score.MaxCombo, which seems... worrying to say the least, so I think I'll wait for the score ID...
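For illustration, the numbers in the log are perfect squares (319225 = 565², 318096 = 564²), which is consistent with this reading, though the actual score was not yet identified at this point. A Python sketch (hypothetical, emulating C#'s Math.Clamp, which throws when min > max):

```python
import math

def clamp(value, lo, hi):
    # C#'s Math.Clamp(value, min, max) throws ArgumentException when min > max.
    if lo > hi:
        raise ValueError(f"'{lo:g}' cannot be greater than {hi:g}.")
    return max(lo, min(hi, value))

# The figures in the failure log are perfect squares: 319225 == 565**2 and
# 318096 == 564**2, consistent with a score whose MaxCombo (565) exceeds the
# beatmap's maximum legacy combo (564). These values are inferred, not confirmed.
score_max_combo = 565
maximum_legacy_combo = 564

combo_portion_from_longest = math.pow(score_max_combo, 2)       # 319225.0
maximum_achievable_portion = math.pow(maximum_legacy_combo, 2)  # 318096.0

raised = False
try:
    clamp(1.0, combo_portion_from_longest, maximum_achievable_portion)
except ValueError:
    raised = True  # min > max: the same failure mode as in the log
assert raised
```
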

@smoogipoo:

Let's try this again, this time with ppy/osu-queue-score-statistics#186

!diffcalc
GENERATORS=score

github-actions bot:

Difficulty calculation queued -- please wait! (https://github.com/ppy/osu/actions/runs/7258858973)

This comment will update on completion

@Zyfarok commented Dec 19, 2023:

I've thought about this for a while, and I think that instead of clamping by maximumAchievableComboPortionInScoreV1 we should clamp by the maximum theoretically achievable with the given max combo and miss count, in a similar spirit to how we compute the pessimistic estimate.

double maximumOccurrencesOfLongestCombo = Math.Floor(score.MaxCombo / (maximumLegacyCombo - score.Statistics.GetValueOrDefault(HitResult.Miss)));
double remainingCombo = maximumLegacyCombo - score.Statistics.GetValueOrDefault(HitResult.Miss) - (score.MaxCombo * maximumOccurrencesOfLongestCombo);
double maximumAchievableComboPortionWithThisComboInScoreV1 = Math.Pow(score.MaxCombo, 1.5) * maximumOccurrencesOfLongestCombo + Math.Pow(remainingCombo, 1.5);

I'm not sure whether it's needed to ensure scores don't go over 1M, but if there are any edge cases left it should cover them, and it should improve conversion accuracy on some near-FC scores.
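For what it's worth, here is a Python rendering of the proposed bound. The occurrence count is written here as floor((maximumLegacyCombo − misses) / MaxCombo), i.e. how many times the longest combo fits into the combo-giving objects, which reads as the intended meaning of the snippet above; the helper name and the 1.5 exponent follow that snippet and may not match what finally shipped:

```python
import math

def max_combo_portion_given_max_combo(max_combo: int, maximum_legacy_combo: int, miss_count: int) -> float:
    """Upper bound on the combo portion achievable with a given longest combo
    and miss count: repeat the longest combo as often as it fits, then spend
    whatever combo is left in one final run.

    Sketch only; the 1.5 exponent follows the proposal above and may not
    match the merged implementation.
    """
    objects_giving_combo = maximum_legacy_combo - miss_count
    occurrences = math.floor(objects_giving_combo / max_combo)
    remaining = objects_giving_combo - max_combo * occurrences
    return math.pow(max_combo, 1.5) * occurrences + math.pow(remaining, 1.5)

# A full combo with no misses collapses to the full-combo portion:
assert max_combo_portion_given_max_combo(500, 500, 0) == math.pow(500, 1.5)
# A 250x longest combo on a 500-combo map: two runs of 250.
assert max_combo_portion_given_max_combo(250, 500, 0) == 2 * math.pow(250, 1.5)
```
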

@bdach commented Dec 19, 2023:

I don't even want to consider doing that before #25876 (comment) is thoroughly investigated, because the current version dying on some score like that looks pretty worrying and may indicate / reveal further errors in assumptions.

@bdach commented Dec 19, 2023:

In the interest of pushing this out in a release I have replaced @Zyfarok's original proposal with a more conservative fix that should both address the issue and not die in the way described in #25876 (comment).

I will still 100% want to investigate that discrepancy, but it is what it is for now. @peppy your call if you feel comfortable including this in the hotfix, especially given the changes around the retroactive score re-conversion process.

@Zyfarok commented Dec 19, 2023:

Ah, I skipped your comment because I didn't get it, but now I see. So basically we probably have score.MaxCombo bigger than maximumLegacyCombo (unless something weird happens in the pow?), which is indeed concerning.
Could it be that the scores in question are not legacy scores? Or maybe they abuse a mechanic that once allowed players to get more combo? (Is there anything like that? Yesterday someone told me it was possible to spam repeat sliders in stable to get more score than auto-play.)

@bdach commented Dec 19, 2023:

We don't know for now, and I'm waiting for @smoogipoo to fish out the relevant score ID so that we can investigate further.

@peppy peppy self-requested a review December 19, 2023 17:18
@smoogipoo:

Here's one of the failures (more can be found at the bottom of the log):

System.AggregateException: One or more errors occurred. (Processing legacy score 1218406320 failed. ('319225' cannot be greater than 318096.))

@bdach commented Dec 19, 2023:

I.......

https://osu.ppy.sh/beatmapsets/33787#osu/110586

Half of the high scores on this map have 566x green combo, half 564x. I take https://osu.ppy.sh/scores/osu/420517214 and play it in stable.

(screenshot: replay playback in stable)

It never reaches 566x combo.

I feel I shouldn't be surprised by this at this point, but...

I'll look at the other cases at some point tomorrow but I'm not sure what implications this even has on everything? Do we just pretend this didn't exist and guard every single read of combo from a legacy score to max as given by the difficulty attributes?

@Zyfarok commented Dec 19, 2023:

Yeah, the sane strategy seems to be capping everything (combo, total score, miss count?) between expected bounds, I would say. As I mentioned above, someone explained and showed me examples of people abusing buzz sliders to get scores above SS in stable too...

@Zyfarok commented Dec 19, 2023:

Also, as I mentioned yesterday, it might be better to remove the division by accuracy, to eliminate an edge case where it can overestimate: my assumption about accuracy is realistic enough for FCs, but thinking about it, it's less realistic for other scores (most accuracy drops usually happen in the hard parts, where you don't have combo).
And for FC scores (most leaderboards), this won't decrease conversion accuracy now that we have the max:

double comboPortionInScoreV1 = Math.Max(maximumAchievableComboPortionInScoreV1 * comboProportion, comboPortionFromLongestComboInScoreV1);
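A quick numeric illustration (Python, made-up figures) of what that Math.Max guard does: when the accuracy-scaled estimate falls below what the longest combo alone must have contributed, the guard floors it there:

```python
# Made-up figures: the accuracy-scaled estimate of the combo portion falls
# below what the longest combo alone must have contributed.
maximum_achievable = 318096.0   # maximumAchievableComboPortionInScoreV1 (e.g. 564**2)
from_longest_combo = 200_000.0  # comboPortionFromLongestComboInScoreV1
combo_proportion = 0.55         # accuracy-derived estimate of the combo fraction

estimate = maximum_achievable * combo_proportion  # 174952.8: an underestimate
combo_portion = max(estimate, from_longest_combo)

assert estimate < from_longest_combo
assert combo_portion == from_longest_combo  # the Max guard floors the estimate
```
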

@smoogipoo:

I'll look at the other cases at some point tomorrow but I'm not sure what implications this even has on everything? Do we just pretend this didn't exist and guard every single read of combo from a legacy score to max as given by the difficulty attributes?

I believe this is the best you can do, yeah. It's likely this was a result of a "hax" update that's sometimes done manually by us to fix an issue without clearing leaderboards.

@smoogipoo:

!diffcalc
GENERATORS=score

github-actions bot commented Dec 20, 2023:

@bdach commented Dec 20, 2023:

So the run completed, which is good, but... I'm not sure how to read the spreadsheet...

Definitely didn't expect >10k scores to change, and definitely did not expect swings of 30%. Looking at a few cases sorted by biggest percentage increase:

@peppy @smoogipoo I know you guys have been pushing for merge of this to start score import going, but I don't know what to do at this point. The above does not inspire confidence, and I'm not sure how long it'll take me to get to the bottom of these if that is required.

At the least this currently appears to fix the issue of NaNs / very negative totals?

@Zyfarok commented Dec 20, 2023:

I asked for an example of a map where a player abused buzz sliders to get more than an SS score; here is one (no spinner on the map!):
https://osu.ppy.sh/beatmapsets/337527#osu/746737
(screenshot)

@bdach commented Dec 20, 2023:

@Zyfarok can you elaborate on what that has to do with anything I mentioned here?

@Zyfarok commented Dec 20, 2023:

The connection is that even comboProportion can be > 1, so we really should cap everything between known bounds (not only combo).

@bdach commented Dec 20, 2023:

Merging as-is to prevent this from blocking score imports further.

This made it into the last hotfix release verbatim, so I'm interpreting that as implicit approval.

@smoogipoo left a review comment:

Let's get this in as it fixes the immediate issue. We can adjust later with any more changes that are required.

@smoogipoo smoogipoo merged commit 7227404 into ppy:master Dec 20, 2023
13 of 17 checks passed
@bdach bdach deleted the fix-standardised-score-conversion branch December 20, 2023 10:00
Labels: area:scoring · next release · ruleset/osu! · size/S

Successfully merging this pull request may close: "Plays from stable showing as a large negative number score in local leaderboards" (#25860)

4 participants