-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement simple version of On Stack Replacement (OSR) #32969
Conversation
Add support to runtime and jit to allow switching from unoptimized to optimized code for a method while the method has active stack frames. Details in the included document.
cc @dotnet/jit-contrib @jkotas @noahfalk Not "in plan" for 5.0, but would like to get this into the main code base so we can refine and gain experience over time, similar to what we did with tiered compilation. Currently built by default (for x64), but not enabled by default. There should be minimal impact when not enabled (one extra byte in debug info per method). The plan is to create a special CI leg to enable ongoing testing. To enable one must set When enabled, methods with loops get jitted at Tier0 and get extra code for patchpoints; cost is about 2% of Tier0 code size, and also get patchpoint info. Net size impact is estimated to be around 10 bytes of code and data per Tier0 method (~100 bytes for each method with patchpoints). Many of the file changes are boilerplate. Interesting areas to focus on:
More thinking about trigger policies is needed; what's there now is plausible but has some quirks. I have run through scenarios reported vs 3.0 where QJFL=1 caused steady-state perf losses, and OSR recovers those. I have also run through scenarios where QJFL=0 caused startup losses, and OSR recovers those. NYI for non-x64. arm64 would probably come next, and might be challenging given the diversity of stack frame styles. Passes Tier1 tests on x64 in "aggressive" OSR mode (produce OSR method and transition the first time a patchpoint is reached). |
@@ -552,6 +574,20 @@ void CompressDebugInfo::RestoreBoundariesAndVars( | |||
if (pcVars != NULL) *pcVars = 0; | |||
if (ppVars != NULL) *ppVars = NULL; | |||
|
|||
// Check flag byte and skip over any patchpoint info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is changing the debug info format. The debug info format is part of R2R format, so changing it would need to touch number of places (e.g. src\coreclr\src\tools\crossgen2\ILCompiler.Reflection.ReadyToRun\DebugInfo.cs).
It may be easiest to append the extra stuff at the end, and look for it only when the method has patch info (e.g. only for JITed methods).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let me rework this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have looked at adding patchpoint info to the end of the compressed debug info blob, with the idea that some of the existing debug info producers/consumers would not need to change, but this seems fragile.
The current header record for the debug info just has the lengths of the two encoded sub-records. So there's no way to tell if what lies beyond the sub records is really patchpoint info or just random bytes.
Modifying the header to include more length info is similar in spirit to what I'm already doing. Adding a new header field -- either for overall length, for header length, or for patchpoint info length -- will require all producers and consumers to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any requirement that R2R debug format exactly match runtime format, though it currently does. We could allow them to diverge so that the runtime format supports the patchpoint data and the R2R one stays exactly how it is now. In the future they could again converge if/when we want to take an R2R format change.
Implementation-wise it appears minimal - a boolean flag on RestoreBoundariesAndVars indicating if the patchpoint length byte is expected, EEJitManager::GetBoundariesAndVars() sets it true, the other JitManagers set it false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated so only jitted code debug is impacted, and only when feature is defined.
Still may want to tweak this, say by putting the patchpoint info length in the header instead of a flag byte.
Please add exactly one area label to PR if bot fails to: it trains bot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions inline but this looked pretty good to me (I didn't look at the JIT parts). I'm happy to take a look at it again later after changes but don't feel that it has to block on me.
src/coreclr/src/inc/corinfo.h
Outdated
// Total size of this patchpoint info record, in bytes | ||
unsigned PatchpointInfoSize() const | ||
{ | ||
return m_patchpointInfoSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: low hanging memory overhead optimization, we could eliminate m_patchpointInfoSize and write this as ComputeSize(m_numberOfLocals).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -552,6 +574,20 @@ void CompressDebugInfo::RestoreBoundariesAndVars( | |||
if (pcVars != NULL) *pcVars = 0; | |||
if (ppVars != NULL) *ppVars = NULL; | |||
|
|||
// Check flag byte and skip over any patchpoint info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any requirement that R2R debug format exactly match runtime format, though it currently does. We could allow them to diverge so that the runtime format supports the patchpoint data and the R2R one stays exactly how it is now. In the future they could again converge if/when we want to take an R2R format change.
Implementation-wise it appears minimal - a boolean flag on RestoreBoundariesAndVars indicating if the patchpoint length byte is expected, EEJitManager::GetBoundariesAndVars() sets it true, the other JitManagers set it false.
src/coreclr/src/vm/jithelpers.cpp
Outdated
// Request a new native version that is optimized. | ||
CodeVersionManager::TableLockHolder lock(codeVersionManager); | ||
ILCodeVersion ilCodeVersion = codeVersionManager->GetILCodeVersion(pMD, rejitId); | ||
HRESULT hr = ilCodeVersion.AddNativeCodeVersion(pMD, NativeCodeVersion::OptimizationTier1, &osrNativeCodeVersion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TieredCompilationManager promotion to tier1 first checks for the existence of a OptimizationTier1 code version before creating a new one. If one already exists then we assume that tier1 promotion is in progress and doesn't need to be repeated. This native code version will incorrectly cause normal Tier1 promotion to back off. Assuming that the check is modified to only consider non-OSR tier1 compilations then there remains a race-condition where this NativeCodeVersion appears to be a non-OSR version until the SetOSRInfo call below. I'd suggest making OSR info part of immutable NativeCodeVersion information that is written only once in the constructor. AddNativeCodeVersion would need to be refactored and SetOSRInfo removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the check mentioned by @noahfalk and since the OSR code appears to be specific to each patchpoint, it may prevent a later normal transition to full tier 1 method code. The OSR code could have a different tier identifier but for now it may work to classify it as OptimizationTier0
instead.
Adding the code version to this collection is also a bit odd because it's not a version of code for the method, and it looks like there are already separate data structures tracking OSR code versions. I don't think it hurts much, for example the debugger would show OSR code versions as well. But I'm curious if a code version needs to be added at all, is it used somehow and if so could those cases use the other data structures instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is also a bit odd because it's not a version of code for the method
I suggest that we do think of OSR compilations as 'a version of code for the method'. It is a version that has a different entrypoint and ABI than we are used to, but for the majority of code interacting with CodeVersionManager that isn't an important distinction.
for example the debugger would show OSR code versions as well
That is a good thing : ) The debugger, profiler, and ETW should all be able to enumerate OSR code bodies as potential code regions that resolve to the same method.
I'm curious if a code version needs to be added at al
I believe it is an apt representation:
a) It is the identifier used by PrepareCode that identifies the code that needs to be created.
b) It allows the debugger, profiler, and ETW to enumerate the code regions
if so could those cases use the other data structures instead?
I'm sure it is possible, but the code complexity added to the patchpoint table, in PrepareCode, and in all the code version enumerators goes up significantly when we don't use NativeCodeVersion as a common abstraction. The complexity added to NativeCodeVersion to let it represent an OSR version appears far lower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complexity added to NativeCodeVersion to let it represent an OSR version appears far lower.
Maybe not worth the trouble, sounds fine to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am going to stick with a new native code version for now. This is no longer described as Tier1 -- it's now Tier1OSR, so should not confuse tiering...
src/coreclr/src/vm/jithelpers.cpp
Outdated
|
||
// OSR is only supported on x64, currently. | ||
#if !defined(TARGET_AMD64) | ||
if (verbose > 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (verbose > 1) | |
#error OSR in not yet implemented for ABIs other than AMD64 yet |
src/coreclr/src/vm/jithelpers.cpp
Outdated
{ | ||
if (verbose > 1) | ||
{ | ||
printf("### Runtime: INVALID patchpoint [%d] 0x%p\n", ppId, ip); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG(...) macro or STRESS_LOG(...) macro are prefered error logging mechanisms over printf. Log levels should also eliminate the need for manual verbosity checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -973,10 +973,13 @@ PCODE MethodDesc::JitCompileCodeLocked(PrepareCodeConfig* pConfig, JitListLockEn | |||
|
|||
// The profiler may have changed the code on the callback. Need to | |||
// pick up the new code. | |||
// | |||
// (don't want this for OSR, need to see how it works) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Profilers achieve this modification using either SetILFunctionBody or RequestReJIT. The ReJIT case has its safety enforced by the runtime - each time the profiler calls RequestReJIT we allocate a new ILCodeVersion and every JIT request is bound to a particular NativeCodeVersion which in turn implies a specific ILCodeVersion. In the case of tiered compilation or OSR the NativeCodeVersion being constructed was bound to the same ILCodeVersion as the tier0 native code, regardless if the profiler has created some new IL versions in the meantime. For ReJIT GetAndVerifyILHeader() allows the IL code for a given ILCodeVersion to be provided by the profiler lazily, but it is still logically immutable. Whatever IL the profiler gives the runtime is cached in the ILCodeVersion object and that same IL is returned on every future query. We know that the tier0 code would have already requested IL so all the tier1 requests will be working off the runtime cached copy.
SetILFunction body is the older and less safe variation that currently relies on correct usage by profiler authors rather than runtime enforcement. The documentation tells them that they should not do this:
> The SetILFunctionBody method can be called on only those functions that have never been compiled by a just-in-time (JIT) compiler.
The current implementation of SetILFunctionBody overrides the IL of the initial ILCodeVersion. The runtime code treats that IL as if it were immutable, but clearly with abuse by the profiler author it wouldn't be. If a profiler is making these types of modifications and leaving tiered compilation enabled then they have already broken the tiered compilation case (tier0 and tier1 would use different IL). I could imagine that getting it wrong in the OSR case has more visble failure modes though. I haven't thought it through all the way, but probably a safer place to be in the future is to re-implement SetILFunctionBody so that it allocates new ILCodeVersions similar to ReJIT so that we can version it explicitly.
cc @davmason
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback so far. Since this involves reworking the jit interface and that entails lots of coordinated edits, I'd like to get consensus on the overall structure, if possible. Here's what I propose given the suggestions above:
|
Sounds good to me.
We may want to follow similar patterns for this as we follow for GCInfo. |
You mean adding the definition for |
I meant it more a conceptual comment: GCInfo is opaque (compressed) blob. JIT has encoder for it, and the parts of the system that need to understand it have a decoder for it. Now that you are describing |
That was already my conceptual viewpoint too. If moving the current |
We use The structures used to pass the data throughout the system do not have |
Ok, good. So should it just be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly comments on comments, but one question about struct promotion being disabled.
(I've only looked at the document and the JIT code).
assert(info.compILEntry >= 0); | ||
BasicBlock* bbTarget = fgLookupBB(info.compILEntry); | ||
|
||
fgEnsureFirstBBisScratch(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding a call to fgFirstBBisScratch()
(which has some self-consistency asserts) in the phase checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do.
The name doesn't quite convey the idea that this is method is nominally a "side effect free" predicate. Perhaps fgIsFirstBBScratch()
would be an improvement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that doesn't play out as well as I thought. Will leave name as is.
// We need to trim the current try to begin at a different block. Normally | ||
// this would be problematic as we don't have enough context to redirect | ||
// all the incoming edges, but we know oldTryEntry is unreachable. | ||
// So there are no incoming edges to worry about. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused here as well. If there are no incoming edges, then why can't we just eliminate this try entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally you can't branch into the middle of a try; OSR changes this.
So we can have cases now where we just import the middle of a try, and need to trim both begin and end extents.
If a try entry is not imported, then none of the try entry predecessors could have been imported either (even though we can't enumerate them), so there are no "flowgraph live" references to oldTryEntry
.
I do not have strong opinion. |
src/coreclr/src/vm/jithelpers.cpp
Outdated
|
||
// Find the il method version corresponding to this version of the method | ||
// and set up a new native code version for it. | ||
ReJITID rejitId = ReJitManager::GetReJitId(pMD, codeInfo.GetStartAddress()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is code-versioning-specific and not profiler-rejit-specific, it may be better to use a different route. EECodeInfo
now has a GetNativeCodeVersion()
and from that the IL code version can be obtained (inside the lock). Both can be done inside the code versioning lock since the lock allows reentrancy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this is now following your suggestion.
src/coreclr/src/vm/jithelpers.cpp
Outdated
NativeCodeVersion osrNativeCodeVersion; | ||
{ | ||
// Request a new native version that is optimized. | ||
CodeVersionManager::TableLockHolder lock(codeVersionManager); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would conflict with my recent change that removed TableLockHolder
, this is used instead in other places:
CodeVersionManager::LockHolder codeVersioningLockHolder;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
src/coreclr/src/vm/jithelpers.cpp
Outdated
// Request a new native version that is optimized. | ||
CodeVersionManager::TableLockHolder lock(codeVersionManager); | ||
ILCodeVersion ilCodeVersion = codeVersionManager->GetILCodeVersion(pMD, rejitId); | ||
HRESULT hr = ilCodeVersion.AddNativeCodeVersion(pMD, NativeCodeVersion::OptimizationTier1, &osrNativeCodeVersion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the check mentioned by @noahfalk and since the OSR code appears to be specific to each patchpoint, it may prevent a later normal transition to full tier 1 method code. The OSR code could have a different tier identifier but for now it may work to classify it as OptimizationTier0
instead.
Adding the code version to this collection is also a bit odd because it's not a version of code for the method, and it looks like there are already separate data structures tracking OSR code versions. I don't think it hurts much, for example the debugger would show OSR code versions as well. But I'm curious if a code version needs to be added at all, is it used somehow and if so could those cases use the other data structures instead?
@@ -351,6 +351,9 @@ End | |||
Crst JitGenericHandleCache | |||
End | |||
|
|||
Crst JitPatchpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe could reuse CrstLeafLock
instead, as there appear to be are many different locks in that class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can no longer be a leaf lock as we have to allocate while holding.
It is only used to convey the extra information the jit needs for an OSR jit request. I could probably pass this information via the |
Updated that bit. Looks like the crossgen comparison diff may be real, need to investigate. |
Can't get the CoreCLR Pri0 Test Run Linux x64 checked failure to repro locally. |
Seems like similar failures seen in #33562. I still cannot repro the linux checked failures locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. A few minor comments.
src/coreclr/src/vm/jithelpers.cpp
Outdated
// Invoke the helper to build the OSR method | ||
osrMethodCode = HCCALL3(JIT_Patchpoint_Framed, pMD, codeInfo, ilOffset); | ||
|
||
// If that failed, mark the patchpoint as invaild. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo: "invaild"
src/coreclr/src/jit/flowgraph.cpp
Outdated
@@ -497,7 +498,7 @@ void Compiler::fgEnsureFirstBBisScratch() | |||
} | |||
|
|||
//------------------------------------------------------------------------ | |||
// fgFirstBBisScratch: Check if fgFirstBB is a scratch block | |||
// fgIsFirstBBScratch: Check if fgFirstBB is a scratch block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you intend to rename the function as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed it and then decided to keep the original name. Missed undoing this one.
src/coreclr/src/jit/flowgraph.cpp
Outdated
@@ -5983,6 +5989,29 @@ void Compiler::fgFindBasicBlocks() | |||
return; | |||
} | |||
|
|||
// If we are doing OSR, add an entry block that simply branches to the right IL offset. | |||
// Might need to do something special if we're entering try regions? | |||
if (opts.jitFlags->IsSet(JitFlags::JIT_FLAG_OSR)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=> opts.IsOSR()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(every usage of JIT_FLAG_OSR
should go through that helper, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will update.
@@ -20752,6 +20911,7 @@ void Compiler::fgDebugCheckBBlist(bool checkBBNum /* = false */, bool checkBBRef | |||
#endif // DEBUG | |||
|
|||
fgDebugCheckBlockLinks(); | |||
fgFirstBBisScratch(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we always require the first BB to be scratch? Is that a new requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No.
That method is called here just as a check. It verifies that if there is a scratch bb, that other expected invariants hold.
@@ -394,6 +394,11 @@ CONFIG_INTEGER(JitGuardedDevirtualizationGuessUniqueInterface, W("JitGuardedDevi | |||
CONFIG_INTEGER(JitGuardedDevirtualizationGuessBestClass, W("JitGuardedDevirtualizationGuessBestClass"), 1) | |||
#endif // DEBUG | |||
|
|||
// Enable insertion of patchpoints into Tier0 methods with loops. | |||
CONFIG_INTEGER(JitPatchpoint, W("JitPatchpoint"), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you enable OSR by setting COMPlus_JitPatchpoint=1
? Seems like "OSR" and/or "On Stack Replacement" wording should be here, and maybe make it clear that this is the "root of all enablement", if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I should rename this to COMPlus_TC_OnStackReplacement
or similar.
You also need to set COMPlus_TC_QuickJitForLoops=1
to turn this on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need to set COMPlus_TC_QuickJitForLoops=1 to turn this on.
Could (should) you make COMPlus_TC_OnStackReplacement
automatically set COMPlus_TC_QuickJitForLoops=1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could at least document that requirement here.
src/coreclr/src/jit/lclvars.cpp
Outdated
|
||
if (opts.IsOSR() && info.compPatchpointInfo->IsExposed(varDscInfo->varNum)) | ||
{ | ||
JITDUMP("-- V%02u is osr exposed\n", varDscInfo->varNum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "osr" should be all-caps "OSR" in all output (and comments)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix.
Need to wait for fix in #33580 before retesting. |
You added a bunch of tests. Should you force them to be OSR tests by having their project files:
For example, the way Or do you want to leave OSR testing to a special "off by default" pipeline as we've discussed? |
Yes, might as well. Also need to suppress patchpoint insertion if the OSR feature is not defined. |
Think we're close enough now that this can come out of draft mode. Looks like I need to resolve another merge conflict. |
@echesakovMSFT any tips for debugging the crossgen comparsion failure? Also looks like that CI leg is archiving an entire repo + artifacts, 4GB of stuff; not sure if that's useful or unintentional. |
It's failing crossgen-comparison with OSR disabled? With OSR disabled shouldn't R2R images be the same as before your PR? |
Right, this should be no diff. Have tried looking at the changes themselves to see if anything jumps out, but didn't spot anything. Will probably try looking for x64_arm crossgen diffs next, or grab the crossgen files and see where the diffs appear (R2R dump should work for this, right?). [Edit: looks like we don't publish the crossgen files, there are empty "a" and "b" folders in the artifacts that I though might hold the binaries]. |
I would start with running crossarch compiler on the libraries that have mistmatch
Then you need an arm machine to run native crossgen. Then you would compare the resulting images with some kind of binary diff tool (I am using https://www.cjmweb.net/vbindiff/) to identify places where images are different. Make sure to ignore image checksums - I don't remember an exact file offset but it appears earlier in a PE image. As soon as you identify a first point of real differences - use R2RDump to identify a corresponding section. In you case, there are any images that differ - so it should be something fundamental and easy to identify. |
@echesakovMSFT thanks; working on doing just that. It sure would be nice, though, if the CI leg saved off at least one of the pairs of R2R images with diffs. I am more interested in things I that should look for in my source changes that might lead to diffs. Have reviewed all the jit changes several times and don't see anything. I suppose it could be something in the runtime changes, maybe the changes to the layout of the debug blob, but none of that should be turned on in crossgen. |
Think this is ready to merge... |
@AndyAyersMS I've just found this PR has a bug. It has modified the managed CORINFO_METHOD_INFO, but is has left the native counterpart untouched: runtime/src/coreclr/src/inc/corinfo.h Lines 1174 to 1186 in 94d35c3
This leads to overwriting of stack locals in crossgen2 in the call runtime/src/coreclr/src/jit/importer.cpp Line 18442 in 94d35c3
It zeroes the native struct, but since the managed counterpart is larger, it ends up zeroing some data after the end. |
@AndyAyersMS Did you see @janvorli's comment here? Did it get fixed? |
Yes. Fixed by #34134. |
Add support to runtime and jit to allow switching from unoptimized to
optimized code for a method while the method has active stack frames.
Details in the included document.