Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

WIP - Tiered Jitting Part Deux #12193

Merged
merged 1 commit into from
Jul 25, 2017
Merged

Conversation

noahfalk
Copy link
Member

@noahfalk noahfalk commented Jun 9, 2017

Do not merge - this is heads up only. This is a continuation of the tiered compilation work trying to establish .Net code versioning on generally more sound footing. In particular - resolving issues between ReJit and tiered compilation.

Please see code-versioning-pr-notes.md for info.

@noahfalk
Copy link
Member Author

noahfalk commented Jun 9, 2017

@janvorli @gkhanna79 @jkotas @sergign60 @davmason @mikem8361 @mjsabby @dotnet/jit-contrib @discostu105
You guys might want to look at varying parts of this that interest you?

I'm hoping I can get JanV to be overall code reviewer? (sorry I know its big and will be even bigger before its done : )

@noahfalk noahfalk force-pushed the tiered_jitting_m2 branch from 3f7837c to 84018d6 Compare June 21, 2017 10:38
@noahfalk
Copy link
Member Author

@janvorli @gkhanna79 @jkotas @sergign60 @davmason @mikem8361 @mjsabby @dotnet/jit-contrib @discostu105 - heads up, lots of updates

Folks writing profilers, be sure to check out the breaking change doc I added in my most recent commit.


## Likely ways you will see behavior change ##

1. There will be more JITCompilation events, and potentially more ReJIT compilation events than there were before.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are multiple JITCompilation events for the same method, is there any way of knowing the last if these (per method)? We currently pre-compute IL-code at ModuleLoadStarted, then keep it in memory until JITCompilationStarted, then exchange it and then delete it. If there will be subsequent JITCompilationStarted events, we cannot do this optimization anymore. Is it possible to know when there is the last-stage JIT event, after which there won't be any more.
Btw: we have not implemented re-jit yet. We'd probably have the same problem there anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you mean that you call ICorProfilerInfo::SetILFunctionBody to do that exchange? If you called this method the first time a method JITs (as you do today), the effect of this would persist automatically for all future jittings of that same method. If I understand you correctly the only thing you might need to change is to check in your JITCompilation handler if the IL code pointer is already NULL. If it is that means you already set the IL and deleted your copy in a previous event so you should avoid calling SetILFunctionBody again with your now deleted IL pointer.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You assumed exactly right. Sounds like we will be fine.


2. These JIT events may originate from a background worker thread that may different from the thread which ulimately runs the jitted code.

3. Calls to ICorProfiler4::GetCodeInfo3 will only return information about the first jitted code body for a given FunctionID,rejitID pair. If profiler scenarios require an enumeration of every jitted code body a new API will need to be created and consumed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the goal of this that multiple profilers can manipulate the same method body in a way, that they build upon each others modifications?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point the goal is that 1 profiler + the runtime can build upon each other's configurations. So for example the profiler could insert an IL call instruction that gives a notification each time the method runs. The runtime could JIT that code a first time very quickly to get the thread running code unblocked, but later when it determines the method is hot it comes back to JIT a more optimized version of the same instrumented IL. This lets your instrumented IL not be at a performance disadvantage to IL that existed on-disk at the beginning of the process.

Its possible a future extension could let multiple profilers instrument the same method one after another, but its not in the short term plan at least.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks.

The whole change will bring down JIT-Overhead in total (because of cheaper first-version-JITs). This is good for profilers, as we often struggle with the overhead introduced by disallowing NGEN images.


3. Calls to ICorProfiler4::GetCodeInfo3 will only return information about the first jitted code body for a given FunctionID,rejitID pair. If profiler scenarios require an enumeration of every jitted code body a new API will need to be created and consumed.

4. A by-product of the current runtime changes is that IL supplied during the JITCompilationStarted callback is now verified the same as if you had provided it during ModuleLoadFinished. This verification might be beneficial, but if there was a useful reason to continue not verifying it I could restore the original no-verification behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of code-verification will this be? We have just worked on https://github.com/dotnet/corert/tree/master/src/ILVerify to achive IL code verification. Will this be similar?

If yes, I like it. But how severe will the performance impact be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through the code it appears to be just some very minimal verification that the signatures are well-formed. For example if you reference a token it ensures that the token exists in metadata, or if you claim a method has 3 locals then the parser should find 3 types in the local var sig. I did not notice any part of this codepath that was checking for type-safety. If you want to examine the details, the code path starts at MethodDesc::GetAndVerifyMetadataILHeader in prestub.cpp of this PR, or I'm happy to help with further questions.

I would expect the performance impact will be unchanged relative to the current runtime performance. Prior to my PR changes this verification code path was still running, but what happened is that we spent the CPU cycles verifying the IL that was loaded from disk, then in JITCompilationStarted you specify new IL so the runtime throws away the IL that was verified and accepts new IL from the profiler without re-running verification. The new code path simply switches the order of these two operations. First we get IL from the profiler, then we run the verification algorithm.

Copy link

@discostu105 discostu105 Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A great improvement. Inserting invalid IL code happens very easily, so a little support preventing that is awesome!

Might be a breaking change though. We had cases of emitting just slightly incorrect IL Code, which the CLR executed just fine, but a verification tool would point out (e.g. missing casts).

What would happen in case of such a verification-error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of verification error the thread attempting to JIT the method would throw a BadImageFormatException. Given how minimal the verification in this code path is I doubt it would catch anything that wouldn't also have broken the JIT, but if there is concern I can change the code to disable it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really a concern. IMHO failing as early as possible with invalid IL code is better.

@@ -0,0 +1,29 @@
# Code Versioning Profiler Breaking Changes #

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you mention code map? Are you fixing those for reJIT cases?

Copy link
Member Author

@noahfalk noahfalk Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I had missed this one, but yes the issue is similar to GetCodeInfo3 where the API only expects one code body. As you point out the API is already somewhat broken because it doesn't handle ReJIT code bodies either.

As a tentative proposal what if we create GetILToNativeMapping3 which takes a code start address rather than a FunctionID? It will take a little research to confirm, but I don't think there should be any significant obstacle to making it work correctly for all code bodies.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you saying that code map is broken after tiered re jit? Adding method is fine I think

Copy link
Member Author

@noahfalk noahfalk Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you saying that code map is broken after tiered re jit?

Yes, the end to end workflow would be broken. GetILToNativeMapping will give you a valid IL to Native map for the first code body which is jitted. However if the same method JITs a second time, the second code body may have a different native to il mapping than the first. The current API provides you no mechanism to access the native to il map that corresponds to that second code body.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is actually two questions in one. First, just want to confirm that exception stack will show the proper line number after tiered re-JIT optimized the method body. Second, can fixing of code map for re-JITed by profiler methods can be included in scope of this work.

For the second one - if it requires a new method - it is fine. Just allow to do it.

Copy link
Member Author

@noahfalk noahfalk Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, just want to confirm that exception stack will show the proper line number after tiered re-JIT optimized the method body

For the Exception.StackTrace issue that you mentioned I'm keeping it separate from this work. If you only use tiered jitting and never call RequestReJIT in the profiler then exception stack traces will work properly. If you do call RequestReJIT then the pre-existing bug may give you incorrect StackTrace results and that might still exist after this work is complete.

Second, can fixing of code map for re-JITed by profiler methods can be included in scope of this work

Yes, I will include fixing GetILToNativeMapping in this work. It may not be in this checkin, but I'll get a solution in place before tiered jitting is ever enabled by default.

@noahfalk noahfalk force-pushed the tiered_jitting_m2 branch 3 times, most recently from 8f2b3cc to 4725e32 Compare June 24, 2017 03:04
@noahfalk
Copy link
Member Author

@brianrob

1. A roadmap that allows all those features to interoperate if we made the effort to do so in the future.
2. Implement that design far enough so that Rejit + Tiered Compilation interoperate now.

For now tiered compilation can reasonably co-exist with the other features by disabling it only on the methods that are eligible for one of the other techniques.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, that means a profiler which does IL-Rewriting that uses Rejit, cannot work with Tiered Compilation? How would it not work? Would RequestReJIT fail? Or would it fail already when COR_PRF_ENABLE_REJIT is requested?

If COR_PRF_ENABLE_REJIT is not requested, Tiered Compilation works just fine?

How would Tiered Compilation be disabled? Is there a way for the profiler to do it (via bitmask)? Can it be disabled just for certain methods and how? What mode is used if it's disabled? Would it always produce highest-quality code (resulting in most CPU usage)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiered jitting will work with rejit, sorry for confusion.

The proposal text you are referencing was added early in the PR before most of the work was done and then moved to a new filename later. I plan to delete it when the PR is done. This text is giving a problem statement that the work in this PR is attempting to solve. So the state before the PR is that tiered jitting didn't work with rejit. After the PR is commited they will work together.

I think this moots the remainder of your questions but happy to dig in for anything that you still think is relevant. FWIW I was contemplating adding a tiered jitting opt-out in the profiler flags in the same way we've got NGEN opt-out. Any opt-out would presumably return you to the JIT behavior that existed before tiered jitting was added.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for clearing that up!

@noahfalk
Copy link
Member Author

noahfalk commented Jul 4, 2017

Just fyi, I created a number of work items for additional work we need to do to get tiered jitting off the ground (#12609, #12610, #12611, #12612, #12617). If your interest directly relates to one of those feel free to add your comments to them.

// manager's active IL version hasn't yet asked the profiler for the IL body to use, in which case we want to filter it
// out from the return in this method.
ILCodeVersion activeILVersion = pCodeVersionManager->GetActiveILCodeVersion(pModule, methodTk);
if (activeILVersion.IsNull() || activeILVersion.GetRejitState() != ILCodeVersion::kStateActive)
Copy link
Member

@brianrob brianrob Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, do we actually need to differentiate between rejit and non-rejit IL? I like the plan that you're on here of taking the rejit code and making it more general purpose, but would like it even more if we didn't have to have a concept of rejit specific storage. If I think about this a bit more, perhaps you're using the term rejit here to represent the fact that we might not be talking about IL that is in an assembly? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICorDebug does have 1st class knowledge of ReJIT IL via APIs like ICorDebugFunction3::GetActiveReJitRequestILCode, so we do need to distinguish. At the moment ReJit is the only thing that can change the IL from its original state in the assembly, but if we wanted to add other IL modifying capabilities in the future I think we would need rethink whether treating each case specially was the right thing to do.

The reason the debugger cares about ReJit IL is that the debugger API can be operated by the same component that is operating the profiler API. It is legal to have the profiler create new IL code that has new local variables, then the debugger views a dump of that code and assigns special meaning to profiler added locals.


In reply to: 127300842 [](ancestors = 127300842)

ILCodeVersion activeILCodeVersion = pCodeVersionManager->GetActiveILCodeVersion(pMD);
NativeCodeVersion activeChild = activeILCodeVersion.GetActiveNativeCodeVersion(pMD);
NativeCodeVersionCollection nativeCodeVersions = activeILCodeVersion.GetNativeCodeVersions(pMD);
for (NativeCodeVersionIterator iter = nativeCodeVersions.Begin(); iter != nativeCodeVersions.End(); iter++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you actually want to run CopyNativeCodeVersiontoReJitData on every element or do you only want to run it once? I read it as once based on the comment below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, its not consistent with the comment so I'll fix that. Just for info I would still expect that anyone using SOS with tiered jitting enabled isn't going to get a nice experience just yet. Issue #12612 is tracking some work to get SOS in better shape for this.


In reply to: 127311445 [](ancestors = 127311445)

{
continue;
}

CopyReJitInfoToReJitData(pRejitInfo, &rgRevertedRejitData[iRejitDataReverted]);
NativeCodeVersionCollection nativeCodeVersions = ilCodeVersion.GetNativeCodeVersions(pMD);
for (NativeCodeVersionIterator iter = nativeCodeVersions.Begin(); iter != nativeCodeVersions.End(); iter++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, needs same fix.


In reply to: 127311558 [](ancestors = 127311558)

src/inc/shash.h Outdated
@@ -479,6 +498,10 @@ class SHash : public TRAITS
return *(const Iterator*)this;
}

KeyIterator()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on these changes? I'm a little suspicious of them because it looks like they allow these structures to not care about the SHash that they're operating on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, I believe I no longer have a dependency on the edits in this file and I can back them out. If I remember right I did something like this in an earlier version of the PR that made a default constuctor useful:
KeyIterator foo;
if(x) { foo = bar } else {foo = baz}.


In reply to: 127312711 [](ancestors = 127312711)

@@ -175,6 +177,7 @@ if (CLR_CMAKE_PLATFORM_UNIX OR CLR_CMAKE_TARGET_ARCH_ARM64)
endif ()
add_definitions(-DFEATURE_SVR_GC)
add_definitions(-DFEATURE_SYMDIFF)
add_definitions(-DFEATURE_TIERED_COMPILATION)
Copy link
Member

@brianrob brianrob Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter that FEATURE_REJIT is conditionally defined, but FEATURE_TIERED_COMPILATION is unconditionally defined? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this one is by design. ReJit depends on jumpstamps and jumpstamps aren't available on ARM. Tiered compilation on the other hand solely uses Precode so it should work on all architectures.


In reply to: 127331986 [](ancestors = 127331986)

@@ -8151,7 +8150,7 @@ void AppDomain::Exit(BOOL fRunFinalizers, BOOL fAsyncExit)
LOG((LF_APPDOMAIN | LF_CORDB, LL_INFO10, "AppDomain::Domain [%d] %#08x %ls is exited.\n",
GetId().m_dwId, this, GetFriendlyNameForLogging()));

ReJitManager::OnAppDomainExit(this);
CodeVersionManager::OnAppDomainExit(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this call be after ETW::LoaderLog::DomainUnload to allow for rundown to access the data, or does it not matter, because rundown should only ever touch the active code? I would expect rundown to only touch active code because you could imagine address overlap if we emitted information about every version, and that would break symbol resolution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching the order does seem conceptually correct - I'll do that. In practice right now it won't matter because none of the code versions are being cleaned up. Historically for ReJit on desktop this looks like it should have been a legitimate issue. I'm guessing the overlap between ETW collection and ReJIT usage was small enough that nobody noticed but it isn't clear.


In reply to: 127332812 [](ancestors = 127332812)

@brianrob
Copy link
Member

brianrob commented Jul 13, 2017

TieredCompilationManager * GetTieredCompilationManager()

It looks like there's a split between what members live on AppDomain and BaseDomain. Would it make sense to pick just one, or is there a reason to split? #Resolved


Refers to: src/vm/appdomain.hpp:3831 in cc2c79e. [](commit_id = cc2c79e, deletion_comment = False)

@noahfalk
Copy link
Member Author

TieredCompilationManager * GetTieredCompilationManager()

That split is by design (I think, I haven't historically done a lot of work that dealt with domain distinctions so its possible I've misunderstood some semantics). Code versions (CodeVersionManager) are associated with whatever domain owns their module which could be the shared domain. Similarly call counts are associated with whatever domain owns their code. The shared domain derives from BaseDomain but not AppDomain. However the tiered compilation manager jits code on a thread affinitized to an AppDomain. Threads can't affinitize to the shared domain. If a thread in AppDomain X calls method Foo and triggers the final call on the call counter then we use the TieredCompilationManager in domain X to jit, even though the code being jitted might belong to the shared domain.


In reply to: 315205327 [](ancestors = 315205327)


Refers to: src/vm/appdomain.hpp:3831 in cc2c79e. [](commit_id = cc2c79e, deletion_comment = False)

}

const char *description = "jit lock";
INDEBUG(description = m_pszDebugMethodName;)
ListLockEntryHolder pEntry(ListLockEntry::Find(pJitLock, this, description));
ReleaseHolder<JitListLockEntry> pEntry(JitListLockEntry::Find(
pJitLock, pConfig->GetCodeVersion(), description));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this entry is now version specific, is it possible that we have a race between two threads attempting to compile the same IL but with different target versions (e.g. different optimization levels), and if so, is it possible that a higher optimization level might lose out to a lower opt-level?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entry being version specific means that those two threads aren't competing for the same lock. Each of them will run concurrently holding a lock that is specific to their version of the code.

CONTRACTL
{
NOTHROW;
GC_NOTRIGGER;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Tab level.

@@ -9354,11 +9353,8 @@ void MethodTable::SetSlot(UINT32 slotNumber, PCODE slotCode)
if (fSharedVtableChunk)
{
MethodDesc* pMD = GetMethodDescForSlotAddress(slotCode);
#ifndef FEATURE_INTERPRETER
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, is there a reason to remove FEATURE_INTERPRETER in a number of places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FEATURE_INTERPRETER used to have a custom scheme for switching between the interpreted code and jitted code but as part of this work I refactored it to use the same code path that tiered jitting uses. One goal was to reduce the number of different ways we do the same thing, the other was to remove special cases from the code I needed to refactor in prestub.

LOG((LF_CLASSLOADER, LL_INFO1000000,
" In PrepareILBasedCode, calling JitCompileCode\n"));
// Mark the code as hot in case the method ends up in the native image
g_IBCLogger.LogMethodCodeAccess(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional to only call this when pCode == NULL? I take it that we're not worried about logging the access here unconditionally because in precompiled code, the probes will get executed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that is intentional. You can see the corresponding old code lines at 1465, 1525, and 1535. I assume what you describe is the case but mostly I was just trying to be faithful to the original logic which also had the pCode == NULL check.

@brianrob
Copy link
Member

@noahfalk, thanks for all the information. Overall, I think this looks good. Based on the code and our conversations about it, the architecture sounds reasonable, and from my read of the code, looks correct. My main concern is around testing, simply because there's a good amount of churn in the VM portions of method compilation and code loading. I think that your plan to run this in the stress runs is a good idea. Furthermore, I'd recommend doing some A/B testing with and without this change on at least one scenario (you may have done this already) that uses fragile NGEN, R2R and jit compilation, to make sure that this doesn't regress the decision making on which code gets used (e.g. we don't end up JIT compiling code that used to be pre-compiled).

Consider me signed-off.

@noahfalk noahfalk force-pushed the tiered_jitting_m2 branch from 37531a5 to e7862b6 Compare July 23, 2017 22:11
This makes tiered compilation work properly with profiler ReJIT, and positions the runtime to integrate other versioning related features together in the future. See the newly added code-versioning design-doc in this commit for more information.

Breaking changes for profilers: See code-versioning-profiler-breaking-changes.md for more details.
@noahfalk noahfalk force-pushed the tiered_jitting_m2 branch from 657148d to fd19989 Compare July 25, 2017 00:39
@noahfalk
Copy link
Member Author

Commit history was squashed in preparation for checkin. The original history remains available in the tiered_jitting_m2_backup branch of my personal github repo for anyone who wants to see it.

@noahfalk
Copy link
Member Author

@dotnet-bot test Windows_NT x64 Release Priority 1 Build and Test

@noahfalk
Copy link
Member Author

@dotnet-bot test Tizen armel Cross Release Build

@noahfalk
Copy link
Member Author

While running through the final testing I encountered more issue (#13019) but I will check it in separately. That issue only affects people experimenting with tiered jitting turned on (so effectively only me).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants