Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initializing RuntimeMetrics #105845

Closed
noahfalk opened this issue Aug 1, 2024 · 4 comments · Fixed by #106014
Closed

Initializing RuntimeMetrics #105845

noahfalk opened this issue Aug 1, 2024 · 4 comments · Fixed by #106014
Assignees
Labels
area-System.Diagnostics.Metric in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@noahfalk
Copy link
Member

noahfalk commented Aug 1, 2024

Description

We recently added RuntimeMetrics, part of our long-running plan to offer all runtime metrics using OpenTelemetry standardized semantic conventions + the new Meter API added in .NET 6. For most metrics it is expected the Meter gets created either because the developer directly invoked new Meter(), or they invoked some higher level API that indirectly creates the Meter on their behalf. However we expect RuntimeMetrics to be available in all .NET apps by default without the developer invoking any explicit API to match the existing behavior available with runtime EventCounters. In particular we expect this repro scenario to work:

Repro

  1. Create a new console app using dotnet new console and modify it so that it doesn't exit immediately like this:
Console.WriteLine("Hello, World!");
Console.ReadLine();
  1. dotnet run
  2. In a separate console run dotnet-counters monitor -n <name_of_app>

Expected behavior
dotnet-counters should display the new metrics:

Press p to pause, r to resume, q to quit.
    Status: Running

[System.Runtime]
    dotnet.assembly.count ({assembly})                                    14
    dotnet.gc.collections ({collection} / 1 sec)
        gc.heap.generation=gen0                                            0
        gc.heap.generation=gen1                                            0
        gc.heap.generation=gen2                                            0
  ...

Actual behavior
dotnet-counters shows the pre-existing EventCounters

Press p to pause, r to resume, q to quit.
    Status: Running

[System.Runtime]
    % Time in GC since last GC (%)                                         0
    Allocation Rate (B / 1 sec)                                            0
    CPU Usage (%)                                                          0
    Exception Count (Count / 1 sec)                                        0
    GC Committed Bytes (MB)                                                0
    GC Fragmentation (%)                                                   0
    GC Heap Size (MB)                                                      0.938
    ...

Cause

In order to create the RuntimeMetrics automatically the current implementation initializes the Meter in response to any MeterListener being created. The premise was that it only matters if the Meter exists when something is capable of listening to it. For dotnet-counters we expect MetricsEventSource to create a MeterListener whenever dotnet-counters starts a new monitoring session.
OnEventCommand triggers the lazy creation of the CommandHandler, which creates AggregationManager, which creates MeterListener. Unforetunately we overlooked that MetricsEventSource itself isn't created unless a Meter exists. This made a chicken-egg situation where RuntimeMetrics Meter depends on MetricsEventSource being created first, but MetricsEventSource is depending on some Meter, such as RuntimeMetrics being created first. Many apps would have code that create some other Meter which breaks the cycle and causes everything to get bootstrapped but a simple console app doesn't do that.

Proposed solution

We need some code in the initialization path of .NET Core apps to ensure that MetricsEventSource is created, similar to what we already do for RuntimeEventSource. Since loading MetricsEventSource will cause another assembly to load (System.Diagnostics.DiagnosticSource) this has some startup performance overhead. We planned to mitigate this overhead by checking both EventSource.IsSupported and the AppContext switch System.Diagnostics.Metrics.Meter.IsSupported before doing the load. This would let form factors that are particularly size or startup cost sensitive to opt out of this as they already do for other EventSource/Meter related overhead. For apps that didn't opt-out we would then use reflection to get a delegate for some well-known non-public method in DiagnosticSource.dll, similar to the approach we use for StackTraceSymbols.
I see that currently CoreCLR, NativeAOT, and Mono all use slightly different approaches for getting RuntimeEventSource initialized (Mono, NativeAOT, CoreCLR). I assume we'd also do a little bit of refactoring to create some common EventSource initialization method that handles both RuntimeEventSource and MetricsEventSource.

Questions/Feedback

@brianrob @jkotas @MichalStrehovsky @LakshanF @tarekgh

  • Are you aware of scenarios where we expect app developers want to leave Meters+EventSource turned on but the cost of the early assembly load may still be problematic?
  • Any specific startup perf or size scenarios we should be measuring to determine if the overhead is acceptable?
  • Does UnsafeAccessor work to load a type in a not yet loaded assembly? I know we added that capability but it wasn't clear if it was applicable here vs. Type.GetType().
  • Any other thoughts/concerns/recommendations on the approach?
    Thanks!
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 1, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 1, 2024
@tarekgh tarekgh added this to the 9.0.0 milestone Aug 1, 2024
@tarekgh tarekgh added area-System.Diagnostics.Metric and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Aug 1, 2024
@jkotas
Copy link
Member

jkotas commented Aug 1, 2024

Does UnsafeAccessor work to load a type in a not yet loaded assembly?

It does not. You have to use regular reflection (Type.GetType).

@jkotas
Copy link
Member

jkotas commented Aug 1, 2024

known non-public method in DiagnosticSource.dll, similar to the approach we use for StackTraceSymbols.

StackTraceSymbols helper only ever ships with the runtime. It is not OOB. It makes the private method more acceptable - there are no concerns about version mixing and matching.

DiagnosticSource is OOB. You need to worry about versions mixing and matching (e.g. .NET 9 runtime using .NET 10 version of DiagnosticSource). We tend to avoid hidden contracts for OOB. You can certainly do it, but our infrastructure won't detect when you make a mistake and remove the method by accident.

@tarekgh
Copy link
Member

tarekgh commented Aug 2, 2024

Can we use any public type in DiagnosticSource (like Activity) for binding from corelib? and then having DiagnsoticSource uses ModuleInitializer to force initializing the MetricEventSource when loading?

noahfalk added a commit to noahfalk/runtime that referenced this issue Aug 6, 2024
Fixes dotnet#105845

Previously MetricsEventSource wasn't being created for apps that didn't ever create a Meter. This caused a chicken-and-egg problem for RuntimeMetrics which weren't created until MetricsEventSource started a tracing session. This change ensures that MetricsEventSource will be created on demand if ETW, EventPipe, or EventListener based tooling starts a tracing session. I took some extra effort to create the EventSource in a deferred fashion to avoid eager loading System.Diagnostics.DiagnosticSource.dll when it might never be needed.

Aside from the fix there were some small improvements:
- Moved NativeRuntimeEventSource to initialize in the same place as other startup EventSources
- Removed a useless lock(EventListener.EventListenersLock) around EventPipe eventProvider registration
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Aug 6, 2024
noahfalk added a commit to noahfalk/runtime that referenced this issue Aug 7, 2024
Fixes dotnet#105845

Previously MetricsEventSource wasn't being created for apps that didn't ever create a Meter. This caused a chicken-and-egg problem for RuntimeMetrics which weren't created until MetricsEventSource started a tracing session. This change ensures that MetricsEventSource will be created on demand if ETW, EventPipe, or EventListener based tooling starts a tracing session. I took some extra effort to create the EventSource in a deferred fashion to avoid eager loading System.Diagnostics.DiagnosticSource.dll when it might never be needed.

Aside from the fix there were some small improvements:
- Moved NativeRuntimeEventSource to initialize in the same place as other startup EventSources
- Removed a useless lock(EventListener.EventListenersLock) around EventPipe eventProvider registration
@noahfalk
Copy link
Member Author

noahfalk commented Aug 7, 2024

I have a PR ready for review to resolve this.

Can we use any public type in DiagnosticSource (like Activity) for binding from corelib? and then having DiagnsoticSource uses ModuleInitializer to force initializing the MetricEventSource when loading?

I didn't go that route because the deferred loading mechanism I created requires that I can get a reference to the EventSource being created. Using a DiagnosticSource ModuleInitializer would get MetricsEventSource created, but wouldn't make the reference to it available. If you think there is a better option I'm still glad to hear it - we should probably move discussion about implementation over to the PR though.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Diagnostics.Metric in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants