*: improve latency when streaming series from Prometheus #3146

simonpasquier · 2020-09-09T13:51:00Z

I added CHANGELOG entry for this change.
Change is not relevant to the end user.

Changes

I've found that when requesting many series (on the order of tens of
thousands), the Thanos sidecar spends half of its time computing the
number of received series. To calculate the number of series, it needs
to build a label-based identifier for each chunked series and compare it
with the previous identifier. In the end, this number is only used for
logging and tracing, so it doesn't feel like it's worth the penalty.
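
For readers who want a feel for the cost described above, here is a minimal sketch. It is not the sidecar code: `Label`, `Frame`, `labelsKey`, `countSeries` and `countFrames` are hypothetical names made up for illustration. It contrasts counting series by building and comparing a label-based identifier per frame with simply counting frames.

```go
package main

import (
	"fmt"
	"strings"
)

// Label and Frame are hypothetical stand-ins for the remote read types;
// they are NOT the actual Thanos or Prometheus definitions.
type Label struct{ Name, Value string }

type Frame struct {
	Labels []Label // labels of the series this chunked frame belongs to
}

// labelsKey builds a string identifier from a label set. Doing this for
// every received frame is the kind of per-frame work the description
// identifies as expensive.
func labelsKey(lbls []Label) string {
	var b strings.Builder
	for _, l := range lbls {
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(l.Value)
		b.WriteByte(',')
	}
	return b.String()
}

// countSeries counts series by comparing each frame's identifier with the
// previous one; consecutive frames with the same labels count as one series.
func countSeries(frames []Frame) int {
	n, prev := 0, ""
	for i, f := range frames {
		key := labelsKey(f.Labels)
		if i == 0 || key != prev {
			n++
		}
		prev = key
	}
	return n
}

// countFrames is the cheap alternative: it never touches the labels.
func countFrames(frames []Frame) int { return len(frames) }

func main() {
	frames := []Frame{
		{Labels: []Label{{"__name__", "up"}, {"job", "a"}}},
		{Labels: []Label{{"__name__", "up"}, {"job", "a"}}},
		{Labels: []Label{{"__name__", "up"}, {"job", "b"}}},
	}
	fmt.Println("series:", countSeries(frames), "frames:", countFrames(frames))
}
```

The frame count is not the same number as the series count (a series can span several frames), but it needs no string building at all.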

This change adds a histogram metric,
`thanos_sidecar_prometheus_store_received_frames`, which tracks the number
of frames per request received from the Prometheus remote read API
(buckets: 10, 100, 1000, 10000, 100000). It can be used to evaluate
whether expensive Series requests are being performed.
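
To illustrate how those bucket boundaries classify requests, here is a stdlib-only sketch of cumulative histogram buckets. It does not use the Prometheus client library, and `frameBuckets` and `cumulativeCounts` are hypothetical names; only the boundary values come from this description.

```go
package main

import "fmt"

// frameBuckets mirrors the bucket upper bounds listed in the description.
var frameBuckets = []float64{10, 100, 1000, 10000, 100000}

// cumulativeCounts returns, for each bucket upper bound, how many
// observations are less than or equal to it. The final element is the
// implicit +Inf bucket, which counts every observation.
func cumulativeCounts(buckets, observations []float64) []uint64 {
	counts := make([]uint64, len(buckets)+1)
	for _, v := range observations {
		for i, upper := range buckets {
			if v <= upper {
				counts[i]++
			}
		}
		counts[len(buckets)]++
	}
	return counts
}

func main() {
	// Three hypothetical requests that received 5, 2500 and 40000 frames.
	counts := cumulativeCounts(frameBuckets, []float64{5, 2500, 40000})
	for i, upper := range frameBuckets {
		fmt.Printf("le=%.0f: %d\n", upper, counts[i])
	}
	fmt.Printf("le=+Inf: %d\n", counts[len(frameBuckets)])
}
```

With buckets like these, a growing gap between the `le=1000` count and the higher buckets would suggest that large Series requests are reaching the sidecar.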

Verification

CPU profile before this PR (half of the time is spent in prompb.(*Label).String)

CPU profile after this PR

CPU graph showing the 50% improvement (the patched version of the Thanos sidecar was deployed around 15:50)

https://snapshot.raintank.io/dashboard/snapshot/mufpH3k2yRJUYLM284SXESwVLG0Ns14i?orgId=2

simonpasquier · 2020-09-10T07:10:00Z

~~prometheus-operator/prometheus-operator#3457 is the prometheus-operator PR including this change.~~
wrong place :)

I've found that when requesting many series (on the order of tens of thousands), the Thanos sidecar spends half of its time computing the number of received series. To calculate the number of series, it needs to build a label-based identifier for each chunked series and compare it with the previous identifier. In the end, this number is only used for logging and tracing, so it doesn't feel like it's worth the penalty.

This change adds a histogram metric, `thanos_sidecar_prometheus_store_received_frames`, which tracks the number of frames per request received from the Prometheus remote read API (buckets: 10, 100, 1000, 10000, 100000). It can be used to evaluate whether expensive Series requests are being performed.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

simonpasquier · 2020-09-15T13:33:16Z

@bwplotka friendly ping :)

bwplotka

Wow.

Amazing. Just curious how you found this? (: Just putting timers manually?

EDIT: Nvm, read your description. Thanks!

bwplotka · 2020-09-15T20:11:20Z

Thanks, amazingly spotted ❤️

simonpasquier force-pushed the improve-prometheus-store branch from 8242808 to 6389103 Compare September 9, 2020 13:52

kakkoyun requested a review from bwplotka September 9, 2020 14:22

simonpasquier force-pushed the improve-prometheus-store branch 4 times, most recently from ae9dbe4 to 74efa25 Compare September 10, 2020 07:05

simonpasquier force-pushed the improve-prometheus-store branch from 74efa25 to f2ab444 Compare September 15, 2020 11:20

bwplotka approved these changes Sep 15, 2020

bwplotka merged commit 1078fe7 into thanos-io:master Sep 15, 2020

simonpasquier deleted the improve-prometheus-store branch September 21, 2020 08:15
