Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sdk] Hardcode known histograms using seconds #4820

Merged
merged 8 commits into from
Sep 5, 2023

Conversation

samlam
Copy link
Contributor

@samlam samlam commented Sep 1, 2023

Closes #4797
Design discussion issue #4797

Changes

  • Change histogram default boundaries when MetricStreamIdentity.Unit is set to "s"
  • New default buckets are set to { 0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 }
  • Created a set of well-known Meters/Instruments for the default
  • Added unit-test for the new defaults

Acknowledgement

The PR is created with the assistance from @CodeBlanch

Merge requirement checklist

  • CONTRIBUTING guidelines followed (nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate [CHANGELOG.md](http://changelog.md/) files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Sep 1, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@samlam samlam changed the title Features/default buckets [sdk] Hardcode known histograms using seconds Sep 1, 2023
@samlam samlam marked this pull request as ready for review September 1, 2023 22:01
@samlam samlam requested a review from a team September 1, 2023 22:01
@codecov
Copy link

codecov bot commented Sep 3, 2023

Codecov Report

Merging #4820 (91bcbcf) into main (34544bd) will increase coverage by 0.05%.
Report is 1 commits behind head on main.
The diff coverage is 100.00%.

❗ Current head 91bcbcf differs from pull request most recent head 7d38c10. Consider uploading reports for the commit 7d38c10 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4820      +/-   ##
==========================================
+ Coverage   83.86%   83.92%   +0.05%     
==========================================
  Files         293      293              
  Lines       11977    11998      +21     
==========================================
+ Hits        10045    10069      +24     
+ Misses       1932     1929       -3     
Files Changed Coverage
src/OpenTelemetry/Metrics/AggregatorStore.cs 100.00%
src/OpenTelemetry/Metrics/Metric.cs 100.00%

[InlineData("System.Net.Http", "http.client.request.duration")]
[InlineData("System.Net.Http", "http.client.request.time_in_queue")]
[InlineData("System.Net.NameResolution", "dns.lookups.duration")]
public void HistogramBucketsDefaultUpdatesForSecondsTest(string meterName, string instrumentName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add another unit test to check that a regular Histogram with Unit as seconds still uses the default Histogram bounds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this resolved? I don't see another unit test added.


var metricStreamIdentity = new MetricStreamIdentity(instrument, metricStreamConfiguration: null);

AggregatorStore aggregatorStore = new(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way of testing is relying a bit too much on the knowledge of the internal implementation. Could you use test the bounds using an InMemoryExporter instead? Get hold of the MetricPoint after the export and check the bounds using metricPoint.GetHistogramBuckets().ExplicitBounds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with InMemoryExporter, let me research it a bit. Thx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way of testing is relying a bit too much on the knowledge of the internal implementation.

This logic is strange to me. It's a unit test. We're changing how AggregatorStore resolves buckets. That's the "unit" we are testing. Logic downstream reliant on the resolved buckets isn't being changed. Other tests should be covering that. I would very much expect unit tests to cover internal details 😄

If we want a bigger integration test exercising more of the SDK, fine to do, but too much of that is going to result in slow CI with a ton of overlap. Just my $0.02 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the "unit" we are testing.

We haven't really followed the pedantic approach to unit testing in our repo. It's a spectrum. The tests in our repo are always somewhere between a strict by-the-book unit test and a blanket integration test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to add tests to cover that this special logic is only applicable if SDK is using defaults. i.e if a View is configured to override buckets, then that gets respected in all metrics, including the ones we are hardcoding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to add tests to cover that this special logic is only applicable if SDK is using defaults. i.e if a View is configured to override buckets, then that gets respected in all metrics, including the ones we are hardcoding.

And I recommend to the existing pattern of using InMemoryExporter to achieve this, though I fully agree its neither unit nor integration tests but something in between!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to add tests to cover that this special logic is only applicable if SDK is using defaults. i.e if a View is configured to override buckets, then that gets respected in all metrics, including the ones we are hardcoding.

Why is that needed? This code is only touching the default bounds which are used if there wasn't a view. We have view tests covering view bounds are selected over the defaults (whatever they may be).

@samlam You could try extending this test. But not sure how to best solve needing an exact meter name.

We haven't really followed the pedantic approach to unit testing in our repo. It's a spectrum. The tests in our repo are always somewhere between a strict by-the-book unit test and a blanket integration test.

Well we are the maintainers. If we ask people to add integration tests that's what we'll get 😄 And we'll have to maintain the extra code forever!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is only touching the default bounds which are used if there wasn't a view.

We know this to be true but it would be good to test that and verify that Views take precedence even for especially treated metrics.

("Microsoft.AspNetCore.RateLimiting", "aspnetcore.rate_limiting.request_lease.duration"),
("Microsoft.AspNetCore.Server.Kestrel", "kestrel.connection.duration"),
("Microsoft.AspNetCore.Server.Kestrel", "kestrel.tls_handshake.duration"),
("OpenTelemetry.Instrumentation.AspNetCore", "http.server.duration"),
Copy link
Member

@vishweshbankwar vishweshbankwar Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http.server.duration and http.client.duration should not be part of this list. Those will continue to follow default buckets with unit as ms. #4802.

This should be
("OpenTelemetry.Instrumentation.AspNetCore", "http.server.request.duration")
("OpenTelemetry.Instrumentation.HttpClient", "http.client.request.duration")

@@ -28,6 +28,22 @@ public sealed class Metric
internal const int DefaultExponentialHistogramMaxScale = 20;

internal static readonly double[] DefaultHistogramBounds = new double[] { 0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000 };
internal static readonly double[] DefaultHistogramBoundsSeconds = new double[] { 0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 };
internal static readonly HashSet<(string, string)> DefaultHistogramBoundMappings = new()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Provide the capacity in the ctor

samlam and others added 3 commits September 5, 2023 12:33
Co-authored-by: Utkarsh Umesan Pillai <66651184+utpilla@users.noreply.github.com>
Co-authored-by: Utkarsh Umesan Pillai <66651184+utpilla@users.noreply.github.com>
Copy link
Contributor

@utpilla utpilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this as the changes made in the src/ look good to me and we would like to unblock 1.6.0 release.

The unresolved comments should be addressed in later PRs.

Copy link
Member

@CodeBlanch CodeBlanch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks @samlam

("Microsoft.AspNetCore.Server.Kestrel", "kestrel.tls_handshake.duration"),
("OpenTelemetry.Instrumentation.AspNetCore", "http.server.request.duration"),
("OpenTelemetry.Instrumentation.Http", "http.client.request.duration"),
("System.Net.Http", "http.client.connection.duration"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to update the PR description with the links to the runtime code for where these metrics are defined.

@utpilla utpilla merged commit a0084f2 into open-telemetry:main Sep 5, 2023
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update default values for explicit histogram bucket boundaries
6 participants