[processor/transform] Add Conversion Function to OTTL for Exponential Histo --> Histogram #33824

Merged 70 commits on Sep 21, 2024

@daidokoro daidokoro commented Jul 1, 2024

Description

This PR adds a custom metric function to the transformprocessor to convert exponential histograms to explicit histograms.

Link to tracking issue: Resolves #33827

Function Name

convert_exponential_histogram_to_explicit_histogram

Arguments:

  • distribution (upper, midpoint, uniform, random)
  • ExplicitBoundaries: []float64
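
As a rough illustration of the two arguments, the sketch below checks them the way a caller might before building the statement. It is not the transformprocessor's code: `validateArgs` and `validDistributions` are hypothetical names, and the only rules encoded are the ones stated in this description (four distribution names, non-empty boundaries).

```go
package main

import (
	"errors"
	"fmt"
)

// validDistributions lists the distribution names given in this PR description.
var validDistributions = map[string]bool{
	"upper": true, "midpoint": true, "uniform": true, "random": true,
}

// validateArgs is a hypothetical helper (an assumption for illustration):
// it rejects unknown distribution names and empty boundary lists.
func validateArgs(distribution string, bounds []float64) error {
	if !validDistributions[distribution] {
		return fmt.Errorf("unknown distribution %q", distribution)
	}
	if len(bounds) == 0 {
		return errors.New("explicit boundaries cannot be empty")
	}
	return nil
}

func main() {
	fmt.Println(validateArgs("random", []float64{10, 20, 30})) // accepted
	fmt.Println(validateArgs("median", []float64{10, 20, 30})) // unknown name
	fmt.Println(validateArgs("upper", nil))                    // empty boundaries
}
```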

Usage example:

processors:
  transform:
    error_mode: propagate
    metric_statements:
    - context: metric
      statements:
        - convert_exponential_histogram_to_explicit_histogram("random", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]) 

Converts:

Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-31 09:35:25.212037 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
Bucket (32.000000, 64.000000], Count: 10
Bucket (64.000000, 128.000000], Count: 22
Bucket (128.000000, 256.000000], Count: 12
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}

To:

Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: Histogram
     -> AggregationTemporality: Delta
HistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-30 21:37:07.830902 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
ExplicitBounds #0: 10.000000
ExplicitBounds #1: 20.000000
ExplicitBounds #2: 30.000000
ExplicitBounds #3: 40.000000
ExplicitBounds #4: 50.000000
ExplicitBounds #5: 60.000000
ExplicitBounds #6: 70.000000
ExplicitBounds #7: 80.000000
ExplicitBounds #8: 90.000000
ExplicitBounds #9: 100.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 0
Buckets #3, Count: 2
Buckets #4, Count: 5
Buckets #5, Count: 0
Buckets #6, Count: 3
Buckets #7, Count: 7
Buckets #8, Count: 2
Buckets #9, Count: 4
Buckets #10, Count: 21
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}

Testing

  • Several unit tests have been added. We also tested by ingesting and converting exponential histograms from the statsdreceiver, as well as directly via the otlpreceiver over gRPC, for several hours with a large amount of data.

  • We have clients that have been running this solution in production for a number of weeks.

Readme description:

convert_exponential_hist_to_explicit_hist

convert_exponential_hist_to_explicit_hist([ExplicitBounds])

The convert_exponential_hist_to_explicit_hist function converts an ExponentialHistogram to an Explicit (normal) Histogram.

ExplicitBounds represents the list of bucket boundaries for the new histogram. This argument is required and cannot be empty.

WARNING:

The process of converting an ExponentialHistogram to an Explicit Histogram is not perfect and may result in a loss of precision. It is important to define an appropriate set of bucket boundaries to minimize this loss. For example, selecting boundaries that are too high or too low may result in histogram buckets that are too wide or too narrow, respectively.

@daidokoro daidokoro requested a review from a team July 1, 2024 11:41
@daidokoro daidokoro marked this pull request as draft July 1, 2024 11:42
@github-actions github-actions bot added the processor/transform Transform processor label Jul 1, 2024
@github-actions github-actions bot requested a review from kentquirk July 1, 2024 11:42
@daidokoro daidokoro changed the title [draft] Add Conversion Function to OTTL for Exponential Histo --> Histogram [processor/transform] Add Conversion Function to OTTL for Exponential Histo --> Histogram Jul 1, 2024
@daidokoro daidokoro marked this pull request as ready for review July 2, 2024 15:21
@TylerHelmuth TylerHelmuth added the ready to merge Code review completed; ready to merge by maintainers label Sep 16, 2024
@TylerHelmuth
Member

@daidokoro looks like you need a go mod tidy and then we'll be good.

@TylerHelmuth TylerHelmuth removed the ready to merge Code review completed; ready to merge by maintainers label Sep 16, 2024
@daidokoro
Contributor Author

Thanks @TylerHelmuth , executed.

…onential_hist_to_explicit_hist_test.go

Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
daidokoro and others added 2 commits September 18, 2024 19:55
…onential_hist_to_explicit_hist_test.go

Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
@daidokoro daidokoro requested a review from a team as a code owner September 20, 2024 13:21
@daidokoro
Contributor Author

@TylerHelmuth

None of the currently failing tests seem to be directly related to this PR.

  • Link check fails, however no links are added by this PR
  • Lint fails due to lack of export data in a package that is unrelated to this change
  • Another lint job fails with a stream error unrelated to this package
  • Lastly the changelog check fails saying there have been changes to the changelog, however this PR does not make changes to that file.

Can you advise on these issues?

Thanks in advance

@TylerHelmuth
Member

TylerHelmuth commented Sep 20, 2024

@daidokoro this has been a bad week for our CI, as there has been a lot of churn in Core that is causing us to do more core library updates than normal. In addition, some new linters and the Prometheus compliance tests are causing issues, as is the general Go proxy issue.

Things are getting cleaned up. One thing that would help is if you allow us to merge commits into your branch from this PR. That would allow me to update your branch with the latest from main and merge code suggestions I make.

@TylerHelmuth
Member

Another thing to do is to run make checks locally. Make sure your local Go version is 1.22.7.

@daidokoro
Contributor Author

Thanks @TylerHelmuth for the update. Feel free to make commits and I'll merge all suggestions. Anything else I can do to help let me know.

@TylerHelmuth
Member

@daidokoro I don't have permission to merge in. I'd say get the latest from main (it should have all the CI fixed), and then ensure make checks passes.

@TylerHelmuth
Member

Needs a make crosslink

@TylerHelmuth TylerHelmuth merged commit 74b1048 into open-telemetry:main Sep 21, 2024
156 checks passed
@github-actions github-actions bot added this to the next release milestone Sep 21, 2024
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this pull request Oct 4, 2024
… Histo --> Histogram (open-telemetry#33824)

---------

Co-authored-by: Kent Quirk <kentquirk@gmail.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
Labels
processor/transform Transform processor
Development

Successfully merging this pull request may close these issues.

[processor/transform] Add Function to convert Exponential Histograms to normal Histograms
7 participants