Optimize the uses of ThreadStatic in class having static ctor #746
In a class that has a `static ctor`, accessing static fields, including `ThreadStatic` ones, is challenging. We have a helper call that finds the base address of the variable being accessed, and in general this helper call cannot be optimized. This means that if we access such a variable in a loop, the corresponding helper call cannot be moved around to optimize the code. The reason is that we need to be precise about when the ctor is called, and there are rules that govern that. Because the helper call cannot be moved, accessing a static variable inside a loop calls the helper on every single iteration. Finding the base address is also time consuming, because it has to go through at least 3 lookup tables (module, thread and variable).

However, in the absence of a `static ctor`, we can safely move the helper call around, and in some cases move it out of the loop. I noticed that `ReserveEntry()` has such code and that the generated code is sub-optimal. We are fixing this in .NET in dotnet/runtime#75785, but the fix will only be in .NET 8.

Until then, I was thinking of fixing it in this code base so that we do not see those issues. To do that, I have created a private class and moved most of the `ThreadStatic` variables into it. Here is the code gen difference:

Test1 code:
Test2 code:
As you can see, in `Test2` the helper call `CORINFO_HELP_GETSHARED_NONGCTHREADSTATIC_BASE` has been moved out of the IG03 loop, because the variable is accessed through class `X`, which has no static ctor. In `Test1`, the helper is still called inside the loop.
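
For illustration, here is a minimal sketch of the two shapes being compared. It is not the code from this PR: the type and member names (`WithStaticCtor`, `WithNestedHolder`, `ThreadLocalState`, `t_sum`, `s_start`) are hypothetical, and whether the helper call is actually hoisted depends on the runtime version and JIT, as described above.

```csharp
using System;

// Hypothetical "Test1" shape: the [ThreadStatic] field lives directly in a
// class that has a static ctor, so each access in the loop goes through the
// thread-static base helper, which (per the description above) is not hoisted.
internal static class WithStaticCtor
{
    [ThreadStatic] private static int t_sum;
    private static readonly int s_start;

    static WithStaticCtor()
    {
        s_start = 0; // any explicit static ctor is enough to change the codegen
    }

    public static int SumInLoop(int iterations)
    {
        t_sum = s_start;
        for (int i = 0; i < iterations; i++)
        {
            t_sum += i; // thread-static access inside the loop
        }
        return t_sum;
    }
}

// Hypothetical "Test2" shape: the same field moved into a private nested
// class with no static ctor and no field initializers, so the helper call
// is eligible to be moved out of the loop.
internal static class WithNestedHolder
{
    private static readonly int s_start;

    static WithNestedHolder()
    {
        s_start = 0;
    }

    private static class ThreadLocalState
    {
        [ThreadStatic] public static int t_sum;
    }

    public static int SumInLoop(int iterations)
    {
        ThreadLocalState.t_sum = s_start;
        for (int i = 0; i < iterations; i++)
        {
            ThreadLocalState.t_sum += i; // access via the ctor-free holder
        }
        return ThreadLocalState.t_sum;
    }
}
```

Both methods compute the same value; only the generated code for the loop is expected to differ. The nested holder deliberately has no field initializers, since a static field initializer also produces a type initializer that has to run before the field is used.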