-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incorrect case sensitivity in FrozenDictionary and FrozenSet for some cases #93986
Fix incorrect case sensitivity in FrozenDictionary and FrozenSet for some cases #93986
Conversation
Tagging subscribers to this area: @dotnet/area-system-collections Issue DetailsSee linked issue #93974 This is just the set of failing tests. Both suggestions in the linked issue have been attempted on my local and they both fix the tests, but I would like confirmation of which approach to go for.
|
Based on the benchmarks that you shared, I would be inclined to say that adding a new class is the best way to go. I wouldn't expect this could regress frozen dictionary construction performance substantially. cc @adamsitnik @stephentoub |
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenDictionary.cs
Show resolved
Hide resolved
I get this benchmark. (using microbenchmarks from dotnet/performance#3432)
|
Might this need backport to 8.0.1, as correctness? |
@danmoseley yes this is about correctness |
15fb484
to
ff89d40
Compare
…hould be the same
ff89d40
to
328d9f1
Compare
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenSet.cs
Show resolved
Hide resolved
? new OrdinalStringFrozenDictionary_FullCaseInsensitiveAscii<TValue>(keys, values, stringComparer, analysis.MinimumLength, analysis.MaximumLengthDiff) | ||
: new OrdinalStringFrozenDictionary_FullCaseInsensitive<TValue>(keys, values, stringComparer, analysis.MinimumLength, analysis.MaximumLengthDiff); | ||
} | ||
else | ||
{ | ||
// if (IgnoreCase) => Can only be true if there are no letters, thus case sensitive comparison still works here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an assertion that could replace this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because the assertion would just be re-calculating what the KeyAnalyzer
does.
...ctions/Frozen/String/OrdinalStringFrozenDictionary_LeftJustifiedSingleCharCaseInsensitive.cs
Show resolved
Hide resolved
@@ -0,0 +1,29 @@ | |||
// Licensed to the .NET Foundation under one or more agreements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following naming convention of other files already committed, shouldn't this be OrdinalStringFrozenDictionary_LeftJustifiedCaseInsensitiveSingleChar.cs
?
|
||
namespace System.Collections.Frozen | ||
{ | ||
internal sealed class OrdinalStringFrozenDictionary_LeftJustifiedSubstringCaseInsensitive<TValue> : OrdinalStringFrozenDictionary<TValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
namespace System.Collections.Frozen | ||
{ | ||
internal sealed class OrdinalStringFrozenDictionary_RightJustifiedSubstringCaseInsensitive<TValue> : OrdinalStringFrozenDictionary<TValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
namespace System.Collections.Frozen | ||
{ | ||
internal sealed class OrdinalStringFrozenSet_LeftJustifiedSubstringCaseInsensitive : OrdinalStringFrozenSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
namespace System.Collections.Frozen | ||
{ | ||
internal sealed class OrdinalStringFrozenSet_RightJustifiedSubstringCaseInsensitive : OrdinalStringFrozenSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seem to be a few implementations being duplicated. Renaming the types to follow existing naming conventions should result in the those conflicts being identified.
@eiriktsarpalis The files named differently are implemented differently. The difference is in whether the hash is case insensitive or both the hash and the equality are case insensitive. Previously the equality was always case sensitive which is the bug being fixed. I'm very happy to take suggestions for better naming. Existing types which aren't relevant to this PR:
Existing types which are being used incorrectly: these do case sensitive hash and case sensitive equality.
New types in this PR: these do case sensitive hash and case insensitive equality which is the bug fix, really.
If this is too confusing and too niche and will have drawbacks with AOT trimming then I'm happy to entirely remove the optimization too to give time to rethink it for .NET 9. |
Given this is a .NET 8 backport candidate let's try to fix the functional issue with a minimal diff, even if it does regress perf. How does performance compare between |
I have now added 2 more implementations to my benchmark and upped my .NET Binaries footprint to 80GB 😱
Removing the optimization will be a regression (haven't got a measurement*) only in the case where we are case insensitive, have identified a partial substring to hash, and that substring contains no letters. In all other cases it matches .NET 8 release. * There aren't any benchmarks for case insensitive frozen dictionaries and I don't have the time to add some. However the regression is simply using a case insensitive hash of up to 8 characters so it's probably not significant. StringComparer has a performance regression for most stringly keyed frozen dictionaries whether case sensitive or not. StringComparison has the same problem and performs slower. The current state of this PR (specialized classes) represents no performance regression however is somewhat harder to reason about and adds more code which might represent a regression in static footprint as I understand it (?). Actual benchmark results below. I am inclined to simplify and just remove the buggy optimization because:
Thoughts? Perf_DefaultFrozenDictionary
Perf_LengthBucketsFrozenDictionary
Perf_SingleCharFrozenDictionary
Perf_SubstringFrozenDictionary
|
@andrewjsaid let's go with the final option - disabling the optimization causing the bug. Would it be possible to open that in a separate PR so that the two diffs can be compared by other reviews as well? |
@eiriktsarpalis done. Please see #94667 to compare to this PR. |
Closing in favor of #94667. |
See linked issue #93974
I've fixed by adding new subclasses for these specific use-cases. Please review thoroughly.