Include and stabilize `experimental-compaction-sleep-interval` flag in releases #18481

JalinWang · 2024-08-21T11:54:22Z

What would you like to be added?

Two parameters govern the auto-compaction process: `experimental-compaction-batch-limit` and `experimental-compaction-sleep-interval`. Although the sleep interval flag was added three years ago in this PR commit, it has not yet been included in any release. The batch limit flag is already being considered for stabilization in a separate issue, and I propose stabilizing `experimental-compaction-sleep-interval` as well.

Why is this needed?

Compaction significantly affects service response time. Spreading the compaction load more evenly is desirable, and that is what these two parameters are for. Workarounds exist today, but adjusting the retention window size offers limited flexibility, and using the built-in mechanism is preferable to maintaining separate scripts.

ivanvc · 2024-08-29T18:24:38Z

Discussed during the fortnightly triage meeting. I'll review the PR.

JalinWang · 2024-08-30T02:37:05Z

> Discussed during the fortnightly triage meeting. I'll review the PR.

Thanks for the update! I'm looking forward to your feedback~

Also, the following PR for bbolt can greatly improve etcd performance in our scenario, where the free space (dbSize - dbSizeInUse) is considerable at times. If possible, could you also comment on the release plan for 1.4.0? The alpha0 was released in January and alpha1 in May, so it seems the next version could be expected in September. That would be a great step toward a stable 1.4.0. (Although we'll still need to wait for etcd 3.6 😫)

## v1.4.0-alpha.0(2024-01-12) change log
- [Record the count of free page to improve the performance of hashmapFreeCount](https://github.com/etcd-io/bbolt/pull/585).

Attachment: our pprof result screenshot (dbSize ~11GB, dbSizeInUse ~6GB)

ivanvc · 2024-09-03T23:25:29Z

@JalinWang, can you help with the CHANGELOG pull request to mention #18514?

Regarding the bbolt change, I'd suggest opening an issue on its repository.

Thanks!

JalinWang · 2024-09-09T02:21:35Z

> @JalinWang, can you help with the CHANGELOG pull request to mention #18514?

Sorry for the late PR. Plz review: #18556 :)

> Regarding the bbolt change, I'd suggest opening an issue on its repository.

okkkkk~

elias-dbx · 2024-10-15T21:20:58Z

Hello, is there any guidance on how to tweak --experimental-compaction-batch-limit and --experimental-compaction-sleep-interval for large clusters?

We have ~40GB etcd databases which create around 2000 new revisions per second. We run compaction once every 30 minutes but see availability drops due to pauses during compaction time.

JalinWang · 2024-10-16T08:36:28Z

> Hello, is there any guidance on how to tweak --experimental-compaction-batch-limit and --experimental-compaction-sleep-interval for large clusters?

Hi~
Personally, I raised --experimental-compaction-sleep-interval and lowered --experimental-compaction-batch-limit to distribute the compaction load evenly across the whole auto compaction interval (typically 1h). This should minimize response-time (RT) spikes during compaction.

I found an article online (link; it is in Chinese, so Google Translate may help) about optimizing etcd for large clusters (~10k nodes), which mentions the "compaction-sleep-interval" parameter. However, it doesn't provide any specific guidance on tuning these two parameters. If you come across any other resources, please share them with me :)
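To make the trade-off between the two flags concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which compaction processes the revisions in batches of at most the batch limit and pauses for the sleep interval between consecutive batches. It ignores the time spent inside each batch and assumes the number of revisions to compact equals the number written since the last compaction, so real etcd behavior may differ. The function names and the example values (1000 and 10 ms) are illustrative assumptions, not etcd APIs or confirmed defaults. The workload figures are the ones reported earlier in this thread (about 2000 revisions per second, compaction every 30 minutes).

```python
import math


def estimate_compaction(revisions: int, batch_limit: int, sleep_interval_s: float):
    """Return (batches, total_sleep_seconds) under the simplified model.

    Hypothetical helper: one pause is assumed between each pair of
    consecutive batches, and per-batch work time is ignored.
    """
    if revisions < 0 or batch_limit <= 0 or sleep_interval_s < 0:
        raise ValueError("revisions and sleep must be >= 0, batch_limit > 0")
    batches = math.ceil(revisions / batch_limit)
    pauses = max(batches - 1, 0)
    return batches, pauses * sleep_interval_s


def sleep_interval_for_window(revisions: int, batch_limit: int, window_s: float) -> float:
    """Sleep interval (seconds) that spreads the pauses over about window_s.

    Hypothetical helper; returns 0.0 when a single batch suffices.
    """
    if revisions < 0 or batch_limit <= 0 or window_s < 0:
        raise ValueError("revisions and window must be >= 0, batch_limit > 0")
    batches = math.ceil(revisions / batch_limit)
    if batches <= 1:
        return 0.0
    return window_s / (batches - 1)


if __name__ == "__main__":
    # Workload from this thread: ~2000 revisions/s, compaction every 30 min.
    revisions = 2000 * 30 * 60

    # Illustrative flag values (assumptions, not confirmed defaults).
    batches, sleep_total = estimate_compaction(revisions, 1000, 0.010)
    print(f"{batches} batches, about {sleep_total:.2f}s of total sleep")

    # Sleep interval needed to stretch the same work over the full window.
    interval = sleep_interval_for_window(revisions, 1000, 30 * 60)
    print(f"sleep interval to fill the window: about {interval:.3f}s")
```

Under this model, a short sleep interval packs all batches into a brief burst, while a longer interval (or a smaller batch limit, which increases the number of pauses) stretches the same work across the compaction period. The chosen values should leave enough headroom for one compaction to finish before the next one is due.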

elias-dbx · 2024-10-17T20:36:55Z

Once we upgrade to 3.5.16 I will try tweaking the compaction sleep interval and report back. We run up to 15k nodes in our k8s clusters.

ivanvc · 2024-10-18T04:22:36Z

I'll close this issue as the backport is complete and is already part of the 3.5.16 release. Please reopen if you feel there's more work to do.

Thanks, @JalinWang, for your contribution.

JalinWang added the type/feature label Aug 21, 2024

JalinWang mentioned this issue Aug 23, 2024 in "Plan to release etcd v3.5.16" #18485 (closed)

ArkaSaha30 mentioned this issue Aug 29, 2024 in "[3.5] Introduce the CompactionSleepInterval flag" #18514 (merged)

ivanvc added the stage/triaged label Aug 29, 2024

ivanvc assigned JalinWang Oct 18, 2024

ivanvc closed this as completed Oct 18, 2024
