-
Notifications
You must be signed in to change notification settings - Fork 2.7k
added a lightweight GC profiling option #22866
Conversation
Would probably want to test this but for the time being these changes look good to me. |
@dotnet-bot test Windows_NT x64 Checked Innerloop Build and Test |
@Maoni0 I guess I didn't? I can't find it. One question on your change. To now determine if there is an STW GC, would we look at runtime suspension? |
ok...taking off [WIP]. found one thing during testing which was an assert in profiling code was fired because I now need to check for the GC event mask in eetoprofintefaceimpl.cpp.
all interested parties please feel free to CR |
@mjsabby GarbageCollectStarted is always during STW. GarbageCollectFinished will not be during STW for BGC. unfortunately GarbageCollectFinished has no args to indicate if it's a BGC. so yes you'd need to turn on the COR_PRF_MONITOR_SUSPENDS mask. that's not a problem right? |
Nopes, it isn’t. I just wanted to confirm and we can make a documentation note too. Thanks. |
Gonna apply your change and try it out to see if GetGenerationBounds is valid between RuntimeSuspendFinished and RuntimeResumeStarted |
@Maoni0 you might want to consider posting a comment about this change in https://github.com/dotnet/coreclr/issues/15136 as it's a central place that people working on .NET Profilers subscribe to. |
@mattwarren seems appropriate. I'll add it; thanks! |
@Maoni0 |
Adding @davmason I talked with @Maoni0 offline but one question for you @mjsabby - can you describe the problem these events are solving for you? Whenever possible I like to have canonical examples of what customers wanted to do with various APIs. A couple of things we'll need to note in the docs:
|
@ygc369 - There isn't a GC specific profiler example, however there are a few other profiler examples here: https://github.com/Microsoft/clr-samples/tree/master/ProfilingAPI Updating the profiling docs is still in progress. |
@noahfalk Knowing GC ranges helps with production monitoring. Often we will have some ObjectIDs, either because a set of machines has ObjectAllocated monitoring turned on, or we acquire them through other means. In our case our DI container provides that info, so knowing these sentinel objects moving from a generation to the other helps with analysis, because we can log that info in high level logs. Also this setting allows us to have these machines have Concurrent GC turned on which wasn't possible previously. https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getgenerationbounds-method says it can be called in GCStart and GCEnd, but not in between. What wording would you like to change? |
Do you have a mechanism to rediscover the ObjectID after a GC changes it?
Imagine your profiler got a series of events GCStart, GCStart, GCEnd, GCEnd. From the implementation side it is legal to call GetGenerationBounds() at the first GCEnd. However the doc makes it ambiguous if that is legal. It says don't call in between GCStart and GCEnd, and the first GCEnd is indeed between the outermost GCStart and GCEnd. I wanted to clarify in the doc that GCEnd is legal, even when nested like that. |
Yes, these objects periodically write events with their own addresses. Then some of these are also in a Dictionary which refers to them, although not sure if that mechanism is being used by us. Ahh yes, your documentation note does make it clearer. Sounds good to me. |
@Maoni0 would the perf impact still be low if we also added MovedReferences2 and SurvivingReferences2 to this option? |
it would certainly be much higher overhead. not as high as what you get right now with the old GC profiling mask. and you can still have BGC enabled but BGC would need an extra STW in order to give you this info and that will make BGC's STW pause way longer. |
Thanks. I'm wondering if it makes sense to add another mask that gives this info OR since we haven't shipped this version of the runtime that this option only gives you this info non-BGC. |
can you please explain how you use info you get with MovedReferences2 and SurvivingReferences2? |
It would be used to implement Object Tracking without a custom way to rediscover ObjectIDs after a GC. Its use case is again mainly approximate analytics and will therefore be used to beef up production monitoring. |
@Maoni0 |
so do you need Surviving refs at all or you will be fine with Moved refs only? if you just want to track Moved, it means we can cut down some overhead and not do anything for BGC. |
@ygc369 - You might be interested in this technique. If that isn't what you are looking for its probably better to create a new issue. |
Yes, surviving references are not needed. |
@noahfalk Yes, that's what I want. But unfortunately, @Maoni0 didn't paste her complete code in that artical, so I still don't know how to use. |
@ygc369 - I moved your issue to #24104 |
added a profiling event mask in the high 32-bit, COR_PRF_HIGH_BASIC_GC, for basic GC monitoring. it can be set via `ICorProfilerInfo5::SetEventMask2`. all this gives you is * GC start callback * GC end callback * update generational bounds note that one different behavior between this and the existing COR_PRF_MONITOR_GC is, aside from the obvious that it doesn't give you any info beyond the above, is that the GC end callback + update generational bounds are enabled for _all_ GCs, not just non concurrent GCs. I kept the behavior the same for COR_PRF_MONITOR_GC because I don't want to risk breaking existing profiling tools that do not anticipate these for concurrent GCs. Commit migrated from dotnet/coreclr@ea099fb
added a profiling event mask in the high 32-bit, COR_PRF_HIGH_BASIC_GC, for basic GC monitoring. it can be set via
ICorProfilerInfo5::SetEventMask2
. all this gives you isnote that one different behavior between this and the existing COR_PRF_MONITOR_GC is, aside from the obvious that it doesn't give you any info beyond the above, is that the GC end callback + update generational bounds are enabled for all GCs, not just non concurrent GCs. I kept the behavior the same for COR_PRF_MONITOR_GC because I don't want to risk breaking existing profiling tools that do not anticipate these for concurrent GCs.
I marked it as [WIP] because I have NOT tested it!
@sywhang @noahfalk
@mjsabby I thought there was an issue that you opened for this but I can't find it...do you know what happened to it?