-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spinning in GC: Consider switching to using YieldProcessorNormalized() #8936
Comments
cc @Maoni0 |
I would like to use this in GC except it can't be a method off of the Thread class as we'd want to use it on our native threads too (like the Server GC threads). There's nothing specific about this that requires it to be on the Thread class. Could we move it off of the Thread and make it into a generic utility function? |
Sure I can move it out to a global function as part of https://github.com/dotnet/coreclr/issues/13984. Is there any specific place it needs to be for the GC to be able to use it? |
If you are going to move this to the finalizer thread, I'd imagine you just have a default value of 1 for s_yieldsPerNormalizedYield (and change the check in IsInitialized obviously), that way YieldProcessorNormalized can always be called, just wouldn't be called with an optimal value till init is done on newer chips. Then GC (or anything else) can always call YieldProcessNormalized, before you do the actual calucation for s_yieldsPerNormalizedYield and doesn't need to worry about when it's initialized. |
I was planning on using the current values on a Skylake processor for the defaults before it's initialized (1 yield == 1 normalized yield, 7 for the max per spin iteration iirc). And yes I'm also planning on removing EnsureYieldProcessorNormalizedInitialized and having YieldProcessorNormalized itself request initialization if not already done so, while using the defaults meanwhile. |
Maybe I'm missing something but I don't know why you would want YieldProcessorNormalized itself to request initialization at all - why incur a 10ms pause in the middle of something when you can have the 10ms done on the finalizer thread (most of the time) on a different CPU? I thought that's what dotnet/coreclr#13984 is about. The finalizer thread would just run the init and whenever it's finished the YieldProcessorNormalized calls will use the value set by the init. So the calls to YieldProcessorNormalized in first 10ms or so may not use the more optimal value but that's ok. After that they will. |
By requesting initialization I just meant to trigger the background initialization without waiting for it to complete, what you described matches what I was thinking |
Just to clarify - what I meant was to have YieldProcessorNormalized itself not request initialization at all; instead you just queue the work to the finalizer thread as part of the EE startup and whenever that's done YieldProcessorNormalized will start using the more optimized value computed. If this is the same as what you were thinking too that's great. |
Oh ok, no that's not what I was thinking. I guess that makes sense, with the GC using it, the initialization will pretty much have to happen at some point in early stages of the app, so there probably won't be much benefit from lazy-initializing based on use. |
Yep :-) |
@Maoni0 @kouvel I'm looking in to this now, it seems like the easiest way to do this would be to add a YieldProcessor method to the GCToEEInterface, then call YieldProcessorNormalized from within the EE. There would be some overhead, but it seems ok when the whole point is waiting. To switch in standalone GC without adding an interface method, we would have to duplicate all of the YieldProcessorNormalized code on the GC side, and I would prefer to avoid code duplication. Does that line up with what you two are thinking? |
yes please add a method on the interface for this. |
The initialization is done in the finalizer thread at the beginning. If you want to avoid the overhead it may be possible to have an interface call to get the resulting values from initialization somehow and just clone the Yield functions. |
Also FYI:
|
this Yield function? FORCEINLINE void YieldProcessorNormalized(const YieldProcessorNormalizationInfo &normalizationInfo)
{
LIMITED_METHOD_CONTRACT;
unsigned int n = normalizationInfo.yieldsPerNormalizedYield;
_ASSERTE(n != 0);
do
{
YieldProcessor();
} while (--n != 0);
} and you are suggesting the interface just returns g_yieldsPerNormalizedYield (and GC's yield function would just loop g_yieldsPerNormalizedYield times)? |
Yea that's what I was thinking. If there are spin loops with exponential back-off and if you want to use it there's another variant of YieldProcessorNormalized() for that which takes another piece of initialization data. |
I am pushing this to post 2.1 as we need to concentrate on getting LocalGC to work and this is not essential to have for 2.1 ZBB. |
this was fixed by dotnet/coreclr#21523 |
Consider switching to using Thread::YieldProcessorNormalized() instead of YieldProcessor() (after calling Thread::EnsureYieldProcessorNormalizedInitialized() before the spin loop)
The text was updated successfully, but these errors were encountered: