-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Prometheus Remote Write Exporter supporting Cortex - conversion and export for scalar OTLP metrics #1577
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this wait 3-4 days? Want to first move the repo to the new protos
Yeah if its what make sense to do. I will close this for now |
Don't have to close. I can still review but will merge later |
Yeah that would be great! I will tidy it up and reopen it tomorrow. Thanks. |
Apologies for the large PR. I want to make sure tests are together with corresponding implementations. The new implementation is ~200 lines, and the rest are tests. |
Codecov Report
@@ Coverage Diff @@
## master #1577 +/- ##
==========================================
- Coverage 92.25% 92.21% -0.04%
==========================================
Files 262 262
Lines 18667 18765 +98
==========================================
+ Hits 17221 17305 +84
- Misses 1037 1044 +7
- Partials 409 416 +7
Continue to review full report at Codecov.
|
73f43af
to
7418ab6
Compare
Hi @bogdandrutu Great to see your real-time changes for OTLP v2. I have a concern about the timing of when these code changes for the Collector and interface to exporter will be stable. Our interns @huyan0 @ercl are completing their internships by first week of September. We would like to make the necessary code changes and tests in the Prometheus remote write exporter we're developing before we can file PRs to handle OTLP v2 changes. Is there a design doc or GitHub tag tracking all design changes you're making in the Collector for handling the updated OTLP definition? We're looking at opentelemetry-proto to understand the latest definition of the OTLP. Thanks again for making these changes to enable OTLP to be the default data definition for the Collector. |
@alolita What's OTLP v2? I thought we are still on v1? |
I think it is referring to the new OTLP proto that Bogdan is moving this repository to. Would the new proto be associated with a version? |
3449935
to
8976d44
Compare
Hello @bogdandrutu, I noticed there were some updates made to OTLP in proto. Should I start refactoring my code according to those changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bogdandrutu I vote to merge this.
return err | ||
} | ||
|
||
if httpResp.StatusCode/100 != 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I noticed the OTel-Go exporter tests for 200 (http.StatusOK
) exactly. Do you think the other values really matter? The other value I'd maybe expect is 202. Do you think the other 200 codes matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about other code. The reference here is the Prometheus remote write implementation:https://github.com/prometheus/prometheus/blob/2c4a4548a8382f7c8966dbfda37f34c43151e316/storage/remote/client.go#L187
Hi @bogdandrutu @jmacd Requesting your review for @huyan0 's PR since his internship ends in 7 days. Please merge this PR as soon as possible. Also please provide an update on whether we should complete OTLP v1 support or wait for your code for OTLP v2 if you think it will land this week. 🙏 |
8976d44
to
89a6e31
Compare
Per discussion with @bogdandrutu and @markcartertm in Collector SIG on August 26, the OTLP metrics definition is still being discussed, thus is doesn't make perfect sense to review this PR now. If no further progress on OTLP is made early next week, this PR will be reviewed. If not, I might not be able to work on this PR due to internship timeline. |
Timestamp: convertTimeStamp(pt.TimeUnixNano), | ||
} | ||
|
||
addSample(tsMap, sample, labels, metric.GetMetricDescriptor().GetType()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not see that addSample treats "MONOTONIC_" as counters. Am I missing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I am treating MONOTONI_
and non-monotonic the same way and assuming current value is already calculated. Is this what you are referring to? The only other different thing I did for MONOTONIC_
is to add a _total
suffix in its name. https://github.com/open-o11y/opentelemetry-collector/blob/166624cad9f682d2982cea442f5b10e5e1b92ed2/exporter/prometheusremotewriteexporter/helper.go#L178
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Prometheus remote write protocol doesn't make this distinction. (I'm not sure how this is handled in a complete system.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://www.robustperception.io/how-does-a-prometheus-counter-work. Here is a article explaining how Prometheus server handles a counter value reset. Counters are allowed to be reset to 0, so there's no need to do anything special here to handle it.
Timestamp: convertTimeStamp(pt.TimeUnixNano), | ||
} | ||
|
||
addSample(tsMap, sample, labels, metric.GetMetricDescriptor().GetType()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, you seem to ignore the MONOTONIC part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as #1577 (comment)
@huyan0 started to review, if the proto changes land first you will rebase otherwise will refactor after. |
@jmacd reviewing and provided feedback (hopefully useful feedback). |
This looks like it could / should merge. Let's discuss in today's Metric SIG meeting. |
0e302f9
to
24f4db3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to replace all the imports with dataold instead of consumer/pdata or internal/data
…te Write Exporter add prometheus remote write exporter address lint issues improve test coverage fix lint error add prometheus remote write exporter to default components change metric name label add conversion from ns to ms rename tests add attribute to label functionality in exporter.go format code resolve conflicts with master switch to fmt.Errorf add conversion from Int64 and Double OTLP metrics add prometheus remote write exporter address lint issues improve test coverage fix lint error add prometheus remote write exporter to default components change metric name label add conversion from ns to ms rename tests add attribute to label functionality in exporter.go format code resolve conflicts with master add check for nil change data format to dataold
24f4db3
to
d1db707
Compare
I have changed:
Thanks. |
* Fix Windows build of Jaeger tests The Jaeger tests use the low-level syscall package. The Windows specific function called in that package has a different function signature than the unix version. Add a windows specific file using the build flags to isolate this OS specific functionality. * Add changes to changelog * Blind succeed to account for unimplemented functionality on Windows
…y#1577) Bumps [boto3](https://github.com/boto/boto3) from 1.23.2 to 1.23.3. - [Release notes](https://github.com/boto/boto3/releases) - [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst) - [Commits](boto/boto3@1.23.2...1.23.3) --- updated-dependencies: - dependency-name: boto3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Description: This PR adds implementation for conversion from Int64, Monotonic Int64, Double, and Monotonic Double metrics to Prometheus TimeSeries, as well as the implementation for sending HTTP request to backend.
cc: @huyan0 @alolita @jmacd @bogdandrutu @annanay25 @gouthamve @open-telemetry/collector-approvers @open-telemetry/collector-maintainers