[pkg/translator/prometheusremotewrite] inefficient memory use during conversion to native histograms #24405
Comments
Pinging code owners for exporter/prometheusremotewrite: @Aneurysm9 @rapphil
cc @sh0rez
still relevant.
still relevant.
I can have a look at this.
This issue has been closed as inactive because it has been stale for 120 days with no activity.
Component(s)
pkg/translator/prometheusremotewrite
Describe the issue you're reporting
During the implementation of #17565 I noticed that much (~50%) of the time is spent reallocating slices for the resulting spans and deltas.
Background
From a conversation with @charleskorn:
I see your PR grafana/mimir#5531; there's a similar issue in translating exponential histograms to native histograms, where the spans and deltas are getting reallocated all the time. Except it's hard to give a good estimate of the size needed there (although the upper bound is obvious) due to the difference between the OTel and Prometheus representations, plus possible downscaling.
How bad is it if we take the upper bound as the estimate?
Haven't calculated yet, it's kind of on my backlog. For deltas the reduction is potentially 2^n, where n is the scale difference between the exponential and the native histogram (in my code it's the scaledown): https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/65e6596c028[…]22439f45e788/pkg/translator/prometheusremotewrite/histograms.go. But there's a chance some buckets are not neatly mergeable, so you cannot say that the reduction is exactly 2^n.
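To make the 2^n figure concrete, here is a minimal Go sketch (not the actual code in histograms.go) of how bucket counts merge when downscaling by n, assuming the first populated bucket index is 0 and ignoring offsets and negative buckets:

```go
package main

import "fmt"

// downscaleBuckets merges exponential-histogram bucket counts when the scale
// is reduced by scaleDown: bucket index i at the original scale maps to
// index i >> scaleDown at the new scale, so up to 2^scaleDown consecutive
// buckets collapse into one. Simplified sketch only (assumes the first
// bucket index is 0 and ignores offsets and negative buckets).
func downscaleBuckets(counts []uint64, scaleDown uint32) []uint64 {
	if scaleDown == 0 {
		return counts
	}
	merged := make([]uint64, 0, (len(counts)>>scaleDown)+1)
	for i, c := range counts {
		target := i >> scaleDown
		if target == len(merged) {
			merged = append(merged, 0)
		}
		merged[target] += c
	}
	return merged
}

func main() {
	// Eight buckets downscaled by 2 collapse into two (a 2^2 reduction) when
	// they align neatly; misaligned offsets make the reduction less than 2^n.
	fmt.Println(downscaleBuckets([]uint64{1, 2, 3, 4, 5, 6, 7, 8}, 2))
	// Output: [10 26]
}
```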
For spans it might be that all buckets fit in a single span because they are consecutive. On the other hand, you might find that every bucket needs its own span. It might be worth calculating the number of deltas and spans in a first pass and just filling them in later. Or maybe a better strategy is to allocate in batches (+10%? double?), no idea.
It'd be interesting to compare doing a first pass to calculate the number of deltas and spans, so you can allocate accurately, with guessing or assuming the worst case.
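A rough sketch of what such a first pass could look like, using illustrative types rather than the real prompb ones: count the spans in one walk over the populated bucket indexes (the number of deltas is just the number of populated buckets), then allocate both slices exactly once.

```go
package main

import "fmt"

// countSpans walks the sorted indexes of populated buckets once and returns
// how many spans the native-histogram representation will need: every gap
// between consecutive indexes starts a new span. Illustrative sketch only.
func countSpans(indexes []int32) int {
	spans := 0
	for i, idx := range indexes {
		if i == 0 || idx != indexes[i-1]+1 {
			spans++
		}
	}
	return spans
}

// span mirrors the shape of a remote-write bucket span (offset from the
// previous span, then a run of consecutive buckets); illustrative only.
type span struct {
	offset int32
	length uint32
}

func main() {
	// Buckets 3 and 4 are consecutive and share a span; 0 and 9 need their own.
	indexes := []int32{0, 3, 4, 9}
	nSpans := countSpans(indexes)

	// Second pass: allocate exactly once, then fill.
	spans := make([]span, 0, nSpans)
	deltas := make([]int64, 0, len(indexes))
	fmt.Println(nSpans, cap(spans), cap(deltas)) // 3 3 4
}
```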
For the batches, something that worked well for a similar problem when streaming chunks from [Mimir] ingesters to queriers was to use a linked list of fixed-size batches, then create one final slice with the correct size and copy the elements from the batches. This saved a bunch of repetitive work copying elements into expanded slices that only had to be copied again when the slice was expanded once more. These batches can also be pooled, so over time you only pay the cost of the final slice.
But given the difference in the amount of data involved here, it might not be worth it (in the ingesters we were dealing with substantial structs, if I remember correctly).
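For reference, a minimal Go sketch of the batching idea described above (not taken from Mimir's code): fixed-size batches are chained in a linked list and recycled through a sync.Pool, and the elements are copied once into an exactly-sized result slice.

```go
package main

import (
	"fmt"
	"sync"
)

const batchSize = 256 // fixed batch capacity; value chosen arbitrarily for the sketch

// batch is one fixed-size node in a singly linked list of partial results.
type batch struct {
	vals [batchSize]int64
	n    int
	next *batch
}

// batchPool lets batches be reused across conversions, so steady-state
// allocations are only the final exact-size slice.
var batchPool = sync.Pool{New: func() any { return new(batch) }}

// collector accumulates values of unknown final count without ever
// growing (and re-copying) a slice.
type collector struct {
	head, tail *batch
	total      int
}

func (c *collector) append(v int64) {
	if c.tail == nil || c.tail.n == batchSize {
		b := batchPool.Get().(*batch)
		b.n, b.next = 0, nil
		if c.tail == nil {
			c.head = b
		} else {
			c.tail.next = b
		}
		c.tail = b
	}
	c.tail.vals[c.tail.n] = v
	c.tail.n++
	c.total++
}

// finish copies everything into one exactly-sized slice and returns the
// batches to the pool.
func (c *collector) finish() []int64 {
	out := make([]int64, 0, c.total)
	for b := c.head; b != nil; {
		out = append(out, b.vals[:b.n]...)
		next := b.next
		batchPool.Put(b)
		b = next
	}
	c.head, c.tail, c.total = nil, nil, 0
	return out
}

func main() {
	var c collector
	for i := int64(0); i < 1000; i++ {
		c.append(i)
	}
	fmt.Println(len(c.finish())) // 1000
}
```

Whether this beats a simple two-pass count-then-allocate approach here would need benchmarking, since spans and deltas are small values compared to the chunk structs mentioned above.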