Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 2 issues when triggering a garbage collection via EventPipe #102612

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 17 additions & 5 deletions src/coreclr/vm/eventtrace.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2403,18 +2403,30 @@ VOID EtwCallbackCommon(
// Special check for the runtime provider's ManagedHeapCollectKeyword. Profilers
// flick this to force a full GC.
if (g_fEEStarted && !g_fEEShutDown && bIsPublicTraceHandle &&
((MatchAnyKeyword & CLR_MANAGEDHEAPCOLLECT_KEYWORD) != 0))
#if !defined(HOST_UNIX)
(ControlCode == EVENT_CONTROL_CODE_ENABLE_PROVIDER) && // only when the provider is enabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unforetunately checking the control code for ENABLE_PROVIDER is only going to give you a partial fix. ETW convention (which EventPipe also follows) is that if there is at least one session still active for a given provider then disable still invokes the callback with ControlCode=ENABLE_PROVIDER.

I think there are two things that would potentially work:

  1. For ETW we could bring back filtering for the CAPTURE_STATE+ENABLE control code. It looks like c16fd29 removed all the filtering on GC heap snapshots so after that they would have been triggered by any enable, disable or capture state. We could narrow it back down to what it was originally: enable and capture state only. Enable on ETW is still broader than ideal, but it preserves back compat and ETW profilers can always use CAPTURE_STATE instead if they want to do it precisely.

  2. For EventPipe, you can distinguish a session disable from a session enable based on whether the sourceId parameter is NULL (Enable=not-null,Disable=null). That parameter isn't currently exposed in this CallbackCommon function but it is present in EventPipeEtwCallbackDotNETRuntime so you could change the signature on this common method to pass it through. Something like:

enum SessionChange
{
    EventPipeSessionEnable,
    EventPipeSessionDisable,
    EtwSessionChangeUnknown
}

EventPipeEtwCallbackDotNETRuntime(sourceId, ...)
{
  SessionChange change = sourceId != null ? EventPipeSessionEnable : EventPipeSessionDisable;
  EtwCallbackCommon(change, ...);
}

EtwCallback(...)
{
   EtwCallbackCommon(EtwSessionChangeUnknown, ...);
}

#endif
((MatchAnyKeyword & CLR_MANAGEDHEAPCOLLECT_KEYWORD) != 0)
)
{
// Profilers may (optionally) specify extra data in the filter parameter
// to log with the GCStart event.
LONGLONG l64ClientSequenceNumber = 0;
#if !defined(HOST_UNIX)
PEVENT_FILTER_DESCRIPTOR FilterData = (PEVENT_FILTER_DESCRIPTOR)pFilterData;
if ((FilterData != NULL) &&
(FilterData->Type == 1) &&
(FilterData->Size == sizeof(l64ClientSequenceNumber)))
if ((FilterData != NULL) && (FilterData->Type == 0) && (FilterData->Size > 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to define a new convention to pass this data which works for EventPipe (thats fine), but breaks the previous convention that was used by ETW profilers. I'm fine to add a new EventPipe-friendly convention but the old one needs to keep working.

{
l64ClientSequenceNumber = *(LONGLONG *) (FilterData->Ptr);
// look for the value without checking the provided id
char* clientSequenceNumberAsString = (char*)FilterData->Ptr;
clientSequenceNumberAsString += strlen(clientSequenceNumberAsString) + 1;

// extract the client sequence number from the string
int result = _snscanf_s(clientSequenceNumberAsString, FilterData->Size - (strlen((char*)FilterData->Ptr) + 1), "%I64u", &l64ClientSequenceNumber);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to create a new convention based on keys and values I think it would be good to handle potentially multiple key-value pairs being specified at the same time of which 0,1, or multiple of those keys might have the matching key name. For example we could define the key name as 'GCSeqNumber' and then the parsing code would need to determine to be able to handle inputs like these:

Foo\0Bar\0GCSeqNumber\025\0  -> 25
Foo\0Bar\0Baz\025\0 -> not present, default to 0
Foo\0GCSeqNumber\0Baz\025\0 -> not present, default to 0
Baz\025\0GCSeqNumber\0NotNumber\0 -> invalid number text, default to 0
GCSeqNumber\025\0Baz\0Foo\0GCSeqNumber\019\0 -> 25 (first one wins?)

if (result != 1)
{
// in case of parsing failure, reset the sequence number to 0
l64ClientSequenceNumber = 0;
}
}
#endif // !defined(HOST_UNIX)
ETW::GCLog::ForceGC(l64ClientSequenceNumber);
Expand Down
1 change: 1 addition & 0 deletions src/native/eventpipe/ep.c
Original file line number Diff line number Diff line change
Expand Up @@ -673,6 +673,7 @@ disable_helper (EventPipeSessionID id)

while (ep_provider_callback_data_queue_try_dequeue (provider_callback_data_queue, &provider_callback_data)) {
ep_rt_prepare_provider_invoke_callback (&provider_callback_data);
provider_callback_data.enabled = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets not modify enabled here. EventPipe is following a convention established by ETW. Modifying when we emit the PROVIDER_ENABLED control code makes ETW/EventPipe behave differently which makes it more difficult for anyone to write logic in a provider callback that works correctly for both.

provider_invoke_callback (&provider_callback_data);
ep_provider_callback_data_fini (&provider_callback_data);
}
Expand Down
Loading