Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove public provider from rundown session #91383

Merged
merged 10 commits into from
Sep 13, 2023

Conversation

davmason
Copy link
Member

Fixes #90575

When we start rundown we set the level/keywords on the public and the rundown provider:

ep_provider_config_init (&rundown_providers [0], ep_config_get_public_provider_name_utf8 (), keywords, verbose_logging_level, NULL); // Public provider.
ep_provider_config_init (&rundown_providers [1], ep_config_get_rundown_provider_name_utf8 (), keywords, verbose_logging_level, NULL); // Rundown provider.
// Update provider list with rundown configuration.
for (uint32_t i = 0; i < rundown_providers_len; ++i) {
const EventPipeProviderConfiguration *config = &rundown_providers [i];
EventPipeSessionProvider *session_provider = ep_session_provider_alloc (
ep_provider_config_get_provider_name (config),
ep_provider_config_get_keywords (config),
ep_provider_config_get_logging_level (config),
ep_provider_config_get_filter_data (config));
ep_raise_error_if_nok (ep_session_add_session_provider (session, session_provider));
}

If the user has a different set of events enabled for the public provider this can introduce unwanted events in the trace - i.e. GC events in a trace that specifically excludes them.

I tested that a CPU trace still symbolicates code properly in perfview and VS, if there are other scenarios people think of please let me know.

@davmason davmason added this to the 9.0.0 milestone Aug 31, 2023
@davmason davmason requested review from brianrob, lateralusX and a team August 31, 2023 09:28
@davmason davmason self-assigned this Aug 31, 2023
Copy link
Member

@lateralusX lateralusX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we just remove adding the public provider to rundown? I seen other strange artifacts due to this in the provider callback that would be eliminated if we didn't add the public provider during rundown. I looked through Mono and nothing in its rundown implementation uses events outside of the rundown provider, I assume the same applies to CoreCLR/NativeAOT.

@brianrob
Copy link
Member

Couldn't we just remove adding the public provider to rundown? I seen other strange artifacts due to this in the provider callback that would be eliminated if we didn't add the public provider during rundown. I looked through Mono and nothing in its rundown implementation uses events outside of the rundown provider, I assume the same applies to CoreCLR/NativeAOT.

I am wondering about this as well. Is the reason that these events show up because the hardcoded rundown configuration enables the public provider at verbose level? If so, then I'm thinking that removing the public provider is probably the right answer.

@davmason
Copy link
Member Author

When Noah and I were chatting about this we couldn't be sure if any scenario relied on the public provider during rundown. How confident are we that we won't break any existing scenario by removing the public provider?

@brianrob
Copy link
Member

brianrob commented Sep 1, 2023

When Noah and I were chatting about this we couldn't be sure if any scenario relied on the public provider during rundown. How confident are we that we won't break any existing scenario by removing the public provider?

Reasonably confident, but perhaps this is a good opportunity to consider making rundown configurable? I'm not sure how much work that is though.

@lateralusX
Copy link
Member

lateralusX commented Sep 1, 2023

When Noah and I were chatting about this we couldn't be sure if any scenario relied on the public provider during rundown. How confident are we that we won't break any existing scenario by removing the public provider?

I looked through Mono's rundown implementation and the events we emit are only from the rundown provider. Maybe we could do similar check on CoreCLR, ETW::EnumerationLog::EndRundown (). A brief look indicates that it is mainly using the rundown provider to decide what different events to emit. The events that it emits all seems to be DC kind of events, there is one exception checking the private provider:

BOOL bIsRichDebugInfoEnabled =
    ETW_EVENT_ENABLED(MICROSOFT_WINDOWS_DOTNETRUNTIME_PRIVATE_PROVIDER_DOTNET_Context, JittedMethodRichDebugInfo);

but since the private provider has not been part of rundown, this is either used in some different scenario (maybe ETW) or not working.

Maybe we could use a variation of current fix and do some validation checks on CI, checking that written events into a session that is in rundown mode and validate that events written from rundown thread only comes from the rundown provider, if not, log and abort the process so we can track it on CI?

@davmason
Copy link
Member Author

davmason commented Sep 1, 2023

Maybe we could use a variation of current fix and do some validation checks on CI, checking that written events into a session that is in rundown mode and validate that events written from rundown thread only comes from the rundown provider, if not, log and abort the process so we can track it on CI?

Great idea! I'll give it a shot

davmason and others added 2 commits September 1, 2023 09:05
Co-authored-by: Aleksey Kliger (λgeek) <akliger@gmail.com>
Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filtering seemed fine, the error checking I'm skeptical on.

src/native/eventpipe/ep-session.c Outdated Show resolved Hide resolved
src/coreclr/nativeaot/Runtime/eventpipe/ep-rt-types-aot.h Outdated Show resolved Hide resolved
@davmason davmason changed the title Prevent other threads from writing to a session once rundown begins Remove public provider from rundown session Sep 8, 2023
@davmason
Copy link
Member Author

davmason commented Sep 8, 2023

After running the test CI build and local testing I flipped this to removing the public provider from rundown and changed the title to represent that.

Copy link
Member

@lateralusX lateralusX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@davmason davmason merged commit ce0af21 into dotnet:main Sep 13, 2023
154 checks passed
@davmason
Copy link
Member Author

/backport to release/8.0

@github-actions
Copy link
Contributor

Started backporting to release/8.0: https://github.com/dotnet/runtime/actions/runs/6182464190

@ghost ghost locked as resolved and limited conversation to collaborators Oct 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EventPipe logs unintended events during rundown
5 participants