-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crossgen2 support for static virtual method resolution (take 2) #87438
Conversation
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Does it make sense to also run crossgen outerloop jobs? |
Thanks Egor for the suggestion, I'll run them once the primary runs finish fine to reduce lab cost in case some problem pops up. In general I have recently found out that Crossgen2 outerloop runs are currently busted due to several problematic IL tests so I don't expect a green run but a canary run will certainly be useful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced any changes to MetadataVirtualMethodAlgorithm.cs are needed.
src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs
Outdated
Show resolved
Hide resolved
src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs
Outdated
Show resolved
Hide resolved
src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs
Outdated
Show resolved
Hide resolved
f870aa8
to
9ff9691
Compare
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr crossgen2 outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
@MichalStrehovsky - based on your suggestion I have reverted all changes to Without this extra change Crossgen2 crashes in several SVM tests you originally authored including |
else if (isStaticVirtual) | ||
{ | ||
pResult->thisTransform = CORINFO_THIS_TRANSFORM.CORINFO_NO_THIS_TRANSFORM; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it actually possible to reach this code? My reading indicates that either the directMethod will be resolved to a non-null value above, or we will throw a RequiresRuntimeJitException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reformulated this bit and I believe it's now reachable.
directMethod = constrainedType.ResolveVariantInterfaceMethodToStaticVirtualMethodOnType(originalMethod); | ||
if (directMethod == null) | ||
{ | ||
throw new RequiresRuntimeJitException(originalMethod.ToString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test to the version resiliency tests in src/tests/readytorun/tests so that if the static interface, and implementing type are not in the current R2R module, that if the implementation changes from a default implementation to an exact implementation that the generated R2R code referring to such a thing remains correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in the latest commit, please take a look to double-check if it matches your intent.
src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs
Outdated
Show resolved
Hide resolved
src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs
Outdated
Show resolved
Hide resolved
9ff9691
to
a84ada5
Compare
@@ -995,6 +995,13 @@ private static MethodDesc TryResolveVirtualStaticMethodOnThisType(MetadataType c | |||
if (methodImpl.Decl == interfaceMethodDefinition) | |||
{ | |||
MethodDesc resolvedMethodImpl = methodImpl.Body; | |||
if (resolvedMethodImpl.OwningType != constrainedType.GetTypeDefinition()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the managed type system does a lot less validation than the CoreCLR type system in general - there are many other things that could be wrong with the type and prevent a load on CoreCLR that we'll not detect here. The way this is approached elsewhere is that we emit enough fixups to force this load on the CoreCLR side. I guess adding a check fixes some test, but it's not systematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example of the problem would be:
Call<Foo>();
static void Call<T>() where T : IFoo => T.Something();
interface IFoo
{
static virtual void Something() => Console.WriteLine("Hello");
}
struct Foo : IFoo
{
extern void BreakThings();
}
Here CoreCLR will not allow Foo
to load because it has an extern method with no body (just a random example of things CoreCLR looks at). If we generate code in crossgen2 that bypasses the need to load Foo
(which I believe we'd do here, but I didn't actually check - I'm certain it can be tweaked to achieve that), we'd "fix" this even though it's not supposed to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, can you please explain to me why the problem you described is specific to static virtual methods? I mean, let's say Crossgen2 allows JIT to inline a call to a static method - we're also not making sure on the Crossgen2 side that the callee sits in a loadable class that doesn't have any extern methods defined and whatnot. You may be right that it's problematic in certain corner cases and we should improve the type validation but I just don't see why this should be a SVM-specific problem; if a change in this logic turns out to be necessary, I tend to assume it should apply to all methods, not just static virtuals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's specific to static virtual methods. It is a general problem. I think the throw/check added here is just enough to fix a failing test, but it doesn't even fix this problem for bad MethodImpls, let alone for the many other cases when CoreCLR would refuse to load the type. E.g. if there's a broken MethodImpl for some other method but it's after we already found a match, we'll not notice and will not throw. CoreCLR will.
The preferred solution for this has always been to generate a fixup to force CoreCLR to load this type before we're allowed to run the code.
If a fixup is not possible, we're basically saying we'll need a CoreCLR-like type load emulator that will validate this whenever we touch a type. If that's the case, we should disable the test on ActiveIssue and budget for a CoreCLR-load emulator in .NET 9.
Adding partial checks in places that our tests happened to run into doesn't seem sustainable to me. E.g. we iterate MethodImpls in all of these places. Should each of these validate the MethodImpl is valid? And what about all the other scenarios where CoreCLR will refuse a load that we just happen not to have test coverage with static virtuals?
runtime/src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs
Lines 287 to 301 in cdd7566
foreach (MethodImplRecord record in foundMethodImpls) | |
{ | |
MethodDesc recordDecl = record.Decl; | |
if (interfaceDecl != recordDecl.OwningType.IsInterface) | |
continue; | |
if (!interfaceDecl) | |
recordDecl = FindSlotDefiningMethodForVirtualMethod(recordDecl); | |
if (recordDecl == decl) | |
{ | |
return FindSlotDefiningMethodForVirtualMethod(record.Body); | |
} | |
} |
runtime/src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs
Lines 492 to 515 in cdd7566
foreach (MethodImplRecord methodImplRecord in currentType.VirtualMethodImplsForType) | |
{ | |
MethodDesc declSlot = FindSlotDefiningMethodForVirtualMethod(methodImplRecord.Decl); | |
MethodDesc implSlot = FindSlotDefiningMethodForVirtualMethod(methodImplRecord.Body); | |
if (unificationGroup.IsInGroup(declSlot) && !unificationGroup.IsInGroupOrIsDefiningSlot(implSlot)) | |
{ | |
unificationGroup.RemoveFromGroup(declSlot); | |
separatedMethods ??= new MethodDescHashtable(); | |
separatedMethods.AddOrGetExisting(declSlot); | |
if (unificationGroup.RequiresSlotUnification(declSlot) || implSlot.RequiresSlotUnification()) | |
{ | |
if (implSlot.Signature.EqualsWithCovariantReturnType(unificationGroup.DefiningMethod.Signature)) | |
{ | |
unificationGroup.AddMethodRequiringSlotUnification(declSlot); | |
unificationGroup.AddMethodRequiringSlotUnification(implSlot); | |
unificationGroup.SetDefiningMethod(implSlot); | |
} | |
} | |
continue; | |
} |
runtime/src/coreclr/tools/Common/TypeSystem/Common/MetadataVirtualMethodAlgorithm.cs
Lines 794 to 816 in cdd7566
MethodImplRecord[] possibleImpls = runtimeInterface.FindMethodsImplWithMatchingDeclName(interfaceMethod.Name); | |
if (possibleImpls != null) | |
{ | |
foreach (MethodImplRecord implRecord in possibleImpls) | |
{ | |
if (implRecord.Decl == interfaceMethodDefinition) | |
{ | |
// This interface provides a default implementation. | |
// Is it also most specific? | |
if (mostSpecificInterface == null || Array.IndexOf(runtimeInterface.RuntimeInterfaces, mostSpecificInterface) != -1) | |
{ | |
mostSpecificInterface = runtimeInterface; | |
impl = implRecord.Body; | |
diamondCase = false; | |
} | |
else if (Array.IndexOf(mostSpecificInterface.RuntimeInterfaces, runtimeInterface) == -1) | |
{ | |
diamondCase = true; | |
} | |
break; | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I've written the bare bones of a CoreCLR based type load emulator as part of supporting the skip type validation feature of R2R, but its incomplete, and does not cover all of the failure paths (It handles most of the checks that skip type validation skips). I see that we ALSO have a different validator that is used in NativeAOT/Crossgen2 in other cases in the jit interface. We really ought to unify the two and finish them up in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that we ALSO have a different validator that is used in NativeAOT/Crossgen2 in other cases in the jit interface
That validator was originally written for NativeAOT purposes - we have a more relaxed approach to invalid inputs there and the validator basically just does enough validation to prevent a compiler crash later (type system exceptions are catchable/handlable only when hit while generating method code and fatal in most other places).
Crossgen2 needs to be more strict because "fixing" a problem initially, but resurfacing it again when the method is rejitted into a higher tier is worse than the NativeAOT behavior (that just doesn't throw the exception it was supposed to throw and leaves things at that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both R2R and tiered JIT assume that the IL is valid. Invalid IL can have behavior differences between R2R and JIT or between Tier0 JIT and Tier1 JIT. The AOT compiler validator does not need to be perfect. Its main purpose is to catch missing dependencies that are very common in the code out there. We have tacked some invalid metadata handling to it too, but that is mostly just to make some of our own tests for invalid patterns happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we view it as "just a thing to make a test pass", I'm fine with adding a check here.
However, adding a new ExceptionStringId requires more work than just a new enum member. NativeAOT compiler will potentially catch this exception at compile time and rethrow it at runtime, generating an exception message using CoreLib's localization.
Also the exception type and exception message should match CoreCLR's exception (that's the pattern we've been following).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I generally agree it's a good idea to emit a fixup to make sure the target type of a SVM gets loaded, I'm worried that that this particular case is somewhat more tricky - if Crossgen2 silently ignores "explicit override forwarders" in the sense of allowing the SVM implementation to be redirected to a completely different type, its load check may happily pass as it would be completely oblivious of the original "intermediate type" that spawned the forward; otherwise we would probably need to encode fixups for all three involved types - the original interface type where the SVM is defined, the targeting constrained type and the actual resolution type - and that seems somewhat wasteful to me and potentially cancelling out the SVM prejit perf savings due to more work that needs to happen at runtime.
1. Delegate cctor helper JIT interface method was missing support for type constraint. 2. Methods wrapped in instantiation / unboxing stubs shouldn't claim they have a generic dictionary slot in their GC refmap. Thanks Tomas
* Added stricter check to delegate constructor * Disabled MethodBodyOnUnrelatedType test in Crossgen2 mode * Reverted change to virtual method resolver Thanks Tomas
f1fb1fe
to
1374c32
Compare
@mangod9 - As discussed offline I traditionally hit a snag with Crank that now seems to have trouble using locally built apps. I'll follow up with Sebastién, for now I have dusted off our good old friend Jellyfin and I think I am able to see some runtime JIT reduction with this change - for default R2R publishing 453 methods get jitted at runtime before the app crashes; in composite publishing it gets reduced to 341 runtime-jitted methods and further to 252 methods when using composite built with this Crossgen2 change. While that's just an anecdotal observation, I think it shows the potential for some startup perf improvement; I have rebased this change against the latest main and I plan to retest it and merge it in today around noon unless anyone objects. |
just confirming that the app crashes are what is being fixed here or is that unrelated? Nice that you are observing less JITing which should be good to merge ( assume you are observing more methods emitted in the R2R image as well?). Thx! |
@mangod9 - of course not, it crashes in all build modes, I just didn't bother to fix all of it, we never had a fully functional version supporting .NET 8. In the meantime I have also tried "dotnet new webapi" in composite mode without and with the MIBC data. Without MIBC data, the "normal" version jits 200 methods at runtime (including shutdown), with this change it jits 179 methods. With the StandardOptimizationData.mibc, the normal version jits 75 methods at runtime and the version with this PR jits 53 methods. As you can see, the diffs are similar (about 20 methods) so that's presumably thanks to Crossgen2 being newly able to compile methods calling SVMs. |
For the methods emitted into the R2R file, I don't have an exact diff as the R2RDump diff mode has been broken for a while but I see that for "dotnet new webapi" the full composite PE executable with this change is slightly larger (100.4 MB vs 98.9 MB or 1.4 MB delta) and contains more generics (the instance method entrypoint section size is 4.8 MB vs. 4.74 MB or by about 60 KB larger); interestingly enough the number of non-generic methods is the same which is probably expected as I believe that currently the biggest user of SVMs in the framework is generic maths. |
Seeing a increase in code size for some generic maths, see #84421 (comment). |
I have revived my change from two years back, rebased it against current main and updated it based mostly on Michal's PR feedback. It seems to work in my local testing so I'm publishing the PR and starting lab testing. I originally hoped to revive my old PR
#54063
but GitHub has apparently already locked it so I have to open a new one.
Thanks
Tomas
/cc @dotnet/crossgen-contrib, @dotnet/jit-contrib