Use OC stackdriver exporter to capture self observability metrics as GCM protos #282

@aabmass aabmass commented Jan 20, 2022

Collect the OC self observability metrics using the same MetricTestServer (GCM mock server) + stackdriver exporter instead of a bespoke proto format. They now get serialized as they would be exported to GCM.

@aabmass aabmass force-pushed the self-obs-oc-go-exporter branch 2 times, most recently from 70e9e70 to fb6c2eb Compare January 20, 2022 04:49
@@ -22,14 +22,10 @@ message MetricExpectFixture {
   repeated google.monitoring.v3.CreateTimeSeriesRequest create_time_series_requests = 1;
   repeated google.monitoring.v3.CreateMetricDescriptorRequest create_metric_descriptor_requests = 2;
   repeated google.monitoring.v3.CreateTimeSeriesRequest create_service_time_series_requests = 3;
-  repeated SelfObservabilityMetric self_observability_metrics = 4;
+  SelfObservabilityMetric self_observability_metrics = 4;

@aabmass aabmass Jan 20, 2022

The fields in SelfObservabilityMetric could be flattened into this message, but I feel like nesting the messages makes the JSON easier to read than a long name.

@aabmass aabmass force-pushed the self-obs-oc-go-exporter branch from fb6c2eb to d3c18d0 Compare January 20, 2022 05:04
@aabmass aabmass marked this pull request as ready for review January 20, 2022 05:18
@aabmass aabmass requested review from dashpole and a team January 20, 2022 14:53

@dashpole dashpole left a comment

Why does using the OpenCensus exporter help us? Won't the metrics still have inconsistent values?

aabmass commented Jan 20, 2022

> Why does using the OpenCensus exporter help us? Won't the metrics still have inconsistent values?

Yes, it doesn't fix that problem, but I was going down a rabbit hole of trying to add metric type and histogram bucket options to the SelfObservabilityMetric and figured it's easier to just use the GCM proto.

This is also better if/when we switch these self-observability metrics to OTel instrumentation: we can just use our OTel exporter to capture the same metrics rather than write a conversion to a bespoke protobuf format.

@dashpole

> it's easier to just use the GCM proto.

I think I've almost connected the dots. How does using OpenCensus relate to using the GCM proto?

aabmass commented Jan 20, 2022

> How does using OpenCensus relate to using the GCM proto?

Do you mean using the OpenCensus Stackdriver exporter? It's just the way to serialize OC metrics to GCM protobufs. I may be misunderstanding your question.

@dashpole

Ah, OK. I got it now. We are essentially changing from OpenCensus to GCM protos.

aabmass commented Jan 20, 2022

> Ah, OK. I got it now. We are essentially changing from OpenCensus to GCM protos.

Yup. Probably should have explained that better, sorry.

@aabmass aabmass merged commit db92df8 into GoogleCloudPlatform:col-exporter-rewrite Jan 20, 2022
@aabmass aabmass deleted the self-obs-oc-go-exporter branch January 20, 2022 20:33
dashpole added a commit that referenced this pull request Feb 2, 2022
* Skip all fixture tests (#239)

* Initial structure for new pdata metrics exporter (#238)

* [Metrics Rewrite] add outline with todos for fragmenting work (#240)

* [Metrics Rewrite] attribute to label mapping (#243)

[Metrics Rewrite] attribute to label mapping

* [Metrics Rewrite] support for pdata Sum points (#242)

* [Metrics Rewrite] support for pdata Sum points

* update breaking-changes.md

* use concatenation instead of sprintf

* [Metrics Rewrite] support for pdata Gauge points (#244)

* Add logic to translate metric descriptors and initial flow (#247)

* Fixes from merge.

* Fix tests.

* Clean up test cases, re-disable integration tests.

* Add summary descriptors and label descriptors.

* Fix lint issues.

* Some fixes from review.

* Remove metric import.

* Fixes from review.
- Update default config method
- Simplify some of my lack-of-go expertise.

* Add unit test for metric domains.

* Fixes from review.

* Add breaking changes.

* Fixes from review.

* Update context to be TODO.

* Add support for exponential histograms and exemplars. (#251)

* Add support for exponential histograms and exemplars.

* Fixes from review.

* Fixes from review.

* Fixes from discussion.

* [Metrics Rewrite] implement monitored resource mapping (#252)

* [Metrics Rewrite] implement monitored resource mapping

* review fixes

* [Metrics Rewrite] update breaking-changes.md for monitored resource (#255)

* Add summary mapping to exporter. (#249)

* Add config to call `CreateServiceTimeSeries` (#259)

* Initial implementation of create service time series.

* Add a test case for create service timeseries.

* Add logic to auto-detect project id if not configured.

* Fix from code review

* Fix resource to be one that has retention policy for integration tests.

* Add support for histogram to metrics exporter. (#258)

BUG=210164184

* Re-enable ops-agent self-metric integration test. (#260)

* [Metrics Rewrite] add ExponentialHistogram fixture (#257)

* [Metrics Rewrite] add ExponentialHistogram fixture

* make tests deterministic

* few last changes

* close channel instead of sending a message

* Enable ops agent host metric integration test. (#264)

- There is a bug in upstream agent-metric-processor that sets incorrect units on usage metrics (GoogleCloudPlatform/opentelemetry-operations-collector#72)
- We update the expectations for inclusion of units in CreateTimeSeries
- We disable metric descriptors (for now). Given the bug in agent-metric-processor, ops-agent will likely need an upstream fix for this first.

* add a feature gate, which defaults to false, for using the re-written exporter (#267)

* Enable Basic integration tests (#266)

* Enable basic counter test.

* Enable delta counter metrics.

- Note: Delta counters are NOW fake-delta (i.e. cumulatives with limited time windows)

* Enable non-monotonic-sum integration test.

* Re-enable summary integration test and fix design issues in summary translation.

- Summary exports percentiles, not quantiles
- Percentiles should include similar double precision in the string.

* Fix recordfixtures script to use featuregate (#270)

* Skip already seen attribute keys when creating LabelDescriptors (#272)

* Reenable GKE metrics agent fixtures (#271)

* Update breaking-changes.md for googlecloudmonitoring/point_count self observability (#277)

* Move logging to use zap-logger and set up self-observability to match collector expectations. (#275)

* Enable metric prefix integration tests. (#274)

* enable workloadapis prefix integration test.

* update unknown domain metrics expect.

* Add instrumentationLibraryToLabels method to metrics exporter. (#253)

* Add instrumentationLibraryToLabels method to metrics exporter.

BUG=https://b.corp.google.com/issues/210164355

* Remove custom_metrics_domains behaviour from metrics-exporter.

* Remove dependency on go.opentelemetry.io/collector (#279)

* remove dependency on go.opentelemetry.io/collector

* add ocgrpc metrics to exporters' self-obs metrics (#280)

* Use OC stackdriver exporter to capture self observability metrics as GCM protos (#282)

* Capture ocgrpc self observability metrics (#283)

* make integrationtest not internal (#285)

* Remove internal/ prefix for integrationtest (#288)

* Add batching support to metrics-exporter. (#286)

* Add batching support to metrics-exporter.

* Retry when we fail to write metric descriptors.

* Re-enable workload metrics integration tests (#278)

* update header year for new files (#296)

* Document new CreateMetricDescriptor behavior (#294)

* reenable disabled metrics test (#299)

Co-authored-by: Aaron Abbott <aaronabbott@google.com>
Co-authored-by: Josh Suereth <Joshua.Suereth@gmail.com>
Co-authored-by: Thomas Barker <tbarker25@gmail.com>
Co-authored-by: Punya Biswal <punya@google.com>