Add rate limiting for scrape config updates #2189

swiatekm · 2023-10-03T10:26:34Z

Limit the rate of updates to scrape configs. The idea and code closely follow what Prometheus' Service Discovery manager does: https://github.com/prometheus/prometheus/blob/79be1b835789d7c3fde2a907003a8799c308733f/discovery/manager.go#L341. The end result is that we emit events about changes to Prometheus CRs at most once every 5 seconds.

This should help with #1544

jaronoff97 · 2023-10-10T14:51:28Z

@swiatekm-sumo were you able to test this in a cluster? I'd love to see some metrics about how this change affected usage.

swiatekm · 2023-10-10T15:03:49Z

> @swiatekm-sumo were you able to test this in a cluster? I'd love to see some metrics about how this change affected usage.

I did a basic smoke test, but I didn't try any synthetic stress test where I'd constantly update a bunch of ServiceMonitors and look at TA CPU usage. I can do that and post some numbers if you're interested.

jaronoff97 · 2023-10-10T15:12:49Z

yeah i'd appreciate that if you don't mind :)

swiatekm · 2023-10-15T11:40:57Z

@jaronoff97 I did a very simple benchmark where I updated the labels on a particular ServiceMonitor as fast as I could via kubectl. Before the change this used 400m worth of CPU; after the change, less than 2m. See attached Prometheus graph:


swiatekm force-pushed the feat/ta/update-rate-limit branch 4 times, most recently from 6a998ce to fce41e0 Compare October 7, 2023 13:40

swiatekm marked this pull request as ready for review October 7, 2023 14:25

swiatekm requested review from a team October 7, 2023 14:25

pavolloffay approved these changes Oct 9, 2023

jaronoff97 approved these changes Oct 10, 2023

swiatekm force-pushed the feat/ta/update-rate-limit branch from fce41e0 to 173ad39 Compare October 15, 2023 11:37

frzifus reviewed Oct 15, 2023

cmd/otel-allocator/watcher/promOperator.go (two outdated, resolved review threads)

swiatekm requested a review from frzifus October 17, 2023 10:35

swiatekm and others added 2 commits October 17, 2023 12:35

Add rate limiting for scrape config updates

6ff6126

Rename constant to lowercase

ee8b2cb

Co-authored-by: Ben B. <bongartz@klimlive.de>

swiatekm force-pushed the feat/ta/update-rate-limit branch from 82dd785 to ee8b2cb Compare October 17, 2023 10:35

jaronoff97 merged commit 19f05f2 into open-telemetry:main Oct 18, 2023
24 checks passed

swiatekm deleted the feat/ta/update-rate-limit branch October 18, 2023 09:20

jaronoff97 mentioned this pull request Oct 24, 2023

[target allocator] Scrape configuration hashing is resource intense #1544

Closed

swiatekm mentioned this pull request Apr 12, 2024

Promote @swiatekm-sumo to maintainer #2847

Merged
