Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An edge case when devirtualization doesn't help at all #103280

Open
YohDeadfall opened this issue Jun 11, 2024 · 2 comments
Open

An edge case when devirtualization doesn't help at all #103280

YohDeadfall opened this issue Jun 11, 2024 · 2 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@YohDeadfall
Copy link
Contributor

Description

Recently I wrote an article explaining benchmark results which João Antunes had in his own article, and it's seems that it's indeed an edge case when devirtualization gives nothing except code size. The problem hides in the slow resolution of the default equality comparer.

public readonly record struct StronglyTypedKey<T>(T Value);
var dictionary = new Dictionary<StronglyTypedKey<string>>();
var result = dictionary[new("some key")];

The code above produces shared generic Dictionary<StronglyTypedKey<__Cannon>> type which seems fine at the first glance, but under the hood it always resolves EqualityComparer<T>.Default since the condition typeof(TKey).IsValueType is always satisfied and the _comparer field isn't set. That resolution happens inside of devirtualized GetHashCode and Equals methods of StronglyTypedKey<T>.

Method                              N   Mean     Error   StdDev   Code Size
LookupTraditionalString                    1    19.302 ns 0.0387 ns 0.0362 ns   401 B
LookupTraditionalWithStronglyTypedKeyString          1    38.275 ns 0.0404 ns 0.0378 ns   792 B
LookupTraditionalWithStronglyTypedKeyStringWithCustomComparer 1    31.926 ns 0.0499 ns 0.0467 ns   614 B
                                                              
LookupTraditionalString                    10    16.565 ns 0.0393 ns 0.0368 ns   401 B
LookupTraditionalWithStronglyTypedKeyString          10    37.746 ns 0.0299 ns 0.0280 ns   792 B
LookupTraditionalWithStronglyTypedKeyStringWithCustomComparer 10    32.230 ns 0.3138 ns 0.2935 ns   614 B
                                                             
LookupTraditionalString                    100   17.792 ns 0.0362 ns 0.0339 ns   410 B
LookupTraditionalWithStronglyTypedKeyString          100   38.961 ns 0.0492 ns 0.0460 ns   792 B
LookupTraditionalWithStronglyTypedKeyStringWithCustomComparer 100   31.624 ns 0.1461 ns 0.1220 ns   614 B
                                                               
LookupTraditionalString                    1000   20.657 ns 0.1705 ns 0.1511 ns   410 B
LookupTraditionalWithStronglyTypedKeyString           1000   40.340 ns 0.0433 ns 0.0405 ns   792 B
LookupTraditionalWithStronglyTypedKeyStringWithCustomComparer 1000   32.797 ns 0.1520 ns 0.1422 ns   614 B
                                                                 
LookupTraditionalString                     10000 18.818 ns 0.0832 ns 0.0778 ns   410 B
LookupTraditionalWithStronglyTypedKeyString           10000 39.701 ns 0.0759 ns 0.0710 ns   792 B
LookupTraditionalWithStronglyTypedKeyStringWithCustomComparer 10000 32.882 ns 0.0286 ns 0.0253 ns   620 B

Proposals

It is neither a Dictionary<TKey, TValue>'s fault, nor the runtime's. It's just an edge case, but which can e correctly handled anyway.

I see two options:

  • The JIT should avoid resolution of EqualityComparer<T>.Default to match a devirtualized code, and use T instead since there's only one comparer for T anyway and it never changes during the execution. And only if no devirtualized block exists for T, fallback to EqualityComparer<T>.Default.

  • Add a method to the RuntimeHelpers type which will return if the T is a shared generic or not, and then use it instead of typeof(T).IsValueType for hash-based collections.

@YohDeadfall YohDeadfall added the tenet-performance Performance related issue label Jun 11, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 11, 2024
@EgorBo
Copy link
Member

EgorBo commented Jun 11, 2024

The JIT should avoid resolution of EqualityComparer.Default to match a devirtualized code, and use T instead since there's only one comparer for T anyway and it never changes during the execution. And only if no devirtualized block exists for T, fallback to EqualityComparer.Default.

JIT is not allowed to make such assumptions, you might construct/load a new type dynamically

The issue here is that FindValue in Dictionary is never inlined so it doesn't know that T is statically known and sees _Canon instead. It should not be a problem for Dynamic PGO in theory as it's expected to devirtualize it anyway, but it doesn't happen currently due to #44610

Basically, the minimal repro is:

[MethodImpl(MethodImplOptions.NoInlining)]
static bool Test<T>(StronglyTypedKey<T> a, StronglyTypedKey<T> b) =>
    EqualityComparer<StronglyTypedKey<T>>.Default.Equals(a, b);

public readonly record struct StronglyTypedKey<T>(T Value);

@EgorBo EgorBo added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 11, 2024
@EgorBo EgorBo added this to the Future milestone Jun 11, 2024
@YohDeadfall
Copy link
Contributor Author

JIT is not allowed to make such assumptions, you might construct/load a new type dynamically

So, but if that type cannot be unloaded then that option should be possible, right? There's always a fallback for anything which cannot be devirtualized or assumed correctly.

It should not be a problem for Dynamic PGO in theory as it's expected to devirtualize it anyway, but it doesn't happen currently due to #44610

Could you explain how is it related to indirect calls? Not into all the internals, so cannot understand how the described problem by Andy is related to that.

Basically, the minimal repro is:

Technically, yes, but I mentioned Dictionary<TKey, TValue> as an example where an optimization happens in the constructor:

if (!typeof(TKey).IsValueType)
{
_comparer = comparer ?? EqualityComparer<TKey>.Default;
// Special-case EqualityComparer<string>.Default, StringComparer.Ordinal, and StringComparer.OrdinalIgnoreCase.
// We use a non-randomized comparer for improved perf, falling back to a randomized comparer if the
// hash buckets become unbalanced.
if (typeof(TKey) == typeof(string) &&
NonRandomizedStringEqualityComparer.GetStringComparer(_comparer!) is IEqualityComparer<string> stringComparer)
{
_comparer = (IEqualityComparer<TKey>)stringComparer;
}
}
else if (comparer is not null && // first check for null to avoid forcing default comparer instantiation unnecessarily
comparer != EqualityComparer<TKey>.Default)
{
_comparer = comparer;
}

Therefore, another possible solution for that problem is an intrinsic method which can be treated by the JIT as constant like typeof(T).IsValueType. Otherwise, it can be done purely in C# utilizing a static readonly field which value can be inlined and used for discarding a branch.

public static class RuntimeHelpers
{
    public static bool IsSharedGeneric<T>() => IsSharedGenericHolder<T>.Value;

    private static class IsSharedGenericHolder<T>
    {
        private static readonly bool Value = IsSharedGeneric(typeof(T));
    }

    private static bool IsSharedGeneric(Type type)
    {
        if (type.IsGeneric)
            foreach (var argument in type.GetGenericArguments())
            {
                if (!type.IsValueType || IsSharedGeneric(argument))
                    return true;
            }

        return false;
    }
}

Then RuntimeHelpers.IsSharedGeneric<TKey>() can be used instead of !typeof(TKey).IsValueType to force usage of the default equality comparer for TKey as instance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

2 participants