Open Telemetry : CPU and Memory Usage tracking #4818

Closed
Tracked by #4874
sourabh1007 opened this issue Oct 18, 2024 · 2 comments

@sourabh1007
Contributor

What do we want to collect? (as implemented in the Java SDK)

| Metric Name | Unit | Metric Type | Description |
|---|---|---|---|
| cosmos.client.system.avgCpuLoad | Percent | 95th, 99th + histogram | SDK measures avg. system-wide CPU every 10 seconds. This meter captures the 5-second avg. CPU usage measurements. |
| cosmos.client.system.freeMemoryAvailable | MB | None | SDK measures free memory available for the process in MB every 10 seconds. This meter captures the 5-second measurements. |

Available OpenTelemetry-compatible packages

NuGet Gallery | OpenTelemetry.Instrumentation.Runtime 1.9.0
Usage: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/examples/runtime-instrumentation/Program.cs
Metrics List: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Instrumentation.Runtime/README.md

NuGet Gallery | OpenTelemetry.Instrumentation.Process 0.5.0-beta.6
Usage: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/examples/process-instrumentation/Program.cs
Metrics List: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Instrumentation.Process/README.md#step-2-enable-process-instrumentation

.NET extensions metrics - .NET | Microsoft Learn
Metrics List: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/built-in-metrics-diagnostics#microsoftextensionsdiagnosticshealthchecks

Built-in metrics: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/built-in-metrics-runtime

What do we need in the Cosmos DB SDK?

We have observed in the past that brief CPU spikes negatively impact the customer experience. Existing libraries let us capture CPU usage at intervals, such as every minute (depending on the capabilities of the exporter), but we need more granular data on CPU and memory usage.

Proposal: Enhance the SDK by introducing custom CPU and memory usage metrics. These metrics will collect and record data every 10 seconds, generating a histogram of the values, as outlined above.
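
To make the proposal concrete, here is a minimal, language-neutral sketch in plain Python (not SDK code) of recording periodic CPU readings into a histogram and reading off the 95th and 99th percentiles. The bucket bounds, the `CpuHistogram` class, and the sample readings are all hypothetical, chosen only to show why percentiles surface a brief spike that an average hides.

```python
import math
from bisect import bisect_left

# Hypothetical bucket upper bounds for CPU percent (not taken from any SDK).
BOUNDS = [10, 25, 50, 75, 90, 100]


class CpuHistogram:
    """Fixed-bucket histogram that also keeps raw values for percentiles."""

    def __init__(self, bounds):
        self.bounds = list(bounds)
        # One count per bound, plus a final overflow slot.
        self.counts = [0] * (len(self.bounds) + 1)
        self.values = []

    def record(self, value):
        # A value equal to a bound lands in that bound's bucket.
        self.counts[bisect_left(self.bounds, value)] += 1
        self.values.append(value)

    def percentile(self, p):
        # Nearest-rank percentile over the recorded values.
        ordered = sorted(self.values)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]


# Six made-up readings, standing in for one minute of 10-second samples.
samples = [12.0, 15.0, 11.0, 96.0, 14.0, 13.0]

hist = CpuHistogram(BOUNDS)
for value in samples:
    hist.record(value)

average = sum(samples) / len(samples)
print(f"average={average:.1f} p95={hist.percentile(95)} p99={hist.percentile(99)}")
print("bucket counts:", hist.counts)
```

With these made-up readings the mean is about 26.8 percent, while both the 95th and 99th percentiles are 96.0: the single spike is invisible in the average but dominates the tail.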

@lmolkova
Member

lmolkova commented Oct 18, 2024

It's an anti-pattern to emit runtime metrics in client-specific instrumentations. .NET 9 will have a bunch of native metrics https://github.com/open-telemetry/semantic-conventions/blob/main/docs/runtime/dotnet-metrics.md that cover these and many other things.

The interval at which metrics are collected is configured by users, not instrumentations - https://github.com/open-telemetry/opentelemetry-dotnet/blob/0343715f49ac8e121ec39acd92f8d5572b3d036d/src/OpenTelemetry/Metrics/Reader/PeriodicExportingMetricReaderOptions.cs#L47.

Cosmos measuring things more frequently will result in aggregation across the user-configured interval - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#metricreader-operations
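
The aggregation point can be illustrated with a small plain-Python simulation (this is not OpenTelemetry code). It assumes 10-second sampling, a 60-second export interval, and made-up CPU readings; `aggregate_by_export_interval` is a hypothetical helper that mimics a periodic reader emitting one aggregated histogram point per export interval.

```python
def aggregate_by_export_interval(samples, sample_period_s, export_interval_s):
    """Fold periodic samples into one summary point per export interval."""
    per_export = export_interval_s // sample_period_s
    points = []
    for start in range(0, len(samples), per_export):
        window = samples[start:start + per_export]
        points.append({
            "count": len(window),
            "sum": sum(window),
            "min": min(window),
            "max": max(window),
        })
    return points


# Two minutes of made-up CPU readings (percent), one every 10 seconds.
samples = [12, 15, 11, 96, 14, 13, 10, 12, 11, 13, 12, 14]

# Assumed user-configured export interval of 60 seconds.
points = aggregate_by_export_interval(samples, sample_period_s=10, export_interval_s=60)
for point in points:
    print(point)
```

Twelve samples become two exported points. The spike still shows up in the first point's `max`, but its position inside the minute is gone; only a shorter user-configured export interval restores that time resolution.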

@sourabh1007
Contributor Author

After discussing with the team, we've decided to proceed without implementing custom CPU and memory metrics in the SDK, and will rely on the metrics provided by the .NET libraries. We can revisit this decision in the future if needed.
