KAFKA-19605; Fix the busy loop occurring in kraft client observers #20354

kevin-wu24 · 2025-08-14T12:26:37Z

The broker observer should not read the update voter set timer value
when polling to determine its backoff, since brokers cannot auto-join
the KRaft voter set. If auto-join or kraft.version=1 is not supported,
controller observers should not read this timer when polling either.

The updateVoterSetPeriodMs timer should not be considered when
calculating the backoff returned by polling, since it does not
represent the same thing as the fetchTimeMs timer.
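
To make the failure mode concrete, here is a minimal Java sketch of how folding a periodic timer into the poll timeout can produce a busy loop. It assumes that an expired timer reports zero remaining time and that a replica which can never send the request never resets the timer. The class and method names are hypothetical and are not part of KafkaRaftClient.

```java
// Hypothetical, simplified model; none of these names are Kafka's API.
final class BusyLoopSketch {
    /** Remaining time on a timer with the given deadline; never negative. */
    static long remainingMs(long deadlineMs, long nowMs) {
        return Math.max(0L, deadlineMs - nowMs);
    }

    /** Old shape: the periodic timer is always folded into the poll timeout. */
    static long timeoutIncludingPeriodicTimer(long backoffMs, long periodicDeadlineMs, long nowMs) {
        return Math.min(backoffMs, remainingMs(periodicDeadlineMs, nowMs));
    }

    /** New shape: the periodic timer plays no part in the poll timeout. */
    static long timeoutIgnoringPeriodicTimer(long backoffMs) {
        return backoffMs;
    }

    public static void main(String[] args) {
        long nowMs = 10_000L;
        long expiredDeadlineMs = 3_000L; // assumed: expired long ago and never reset
        long backoffMs = 500L;

        // prints 0: the caller would poll again immediately, over and over
        System.out.println(timeoutIncludingPeriodicTimer(backoffMs, expiredDeadlineMs, nowMs));
        // prints 500: the caller sleeps for the request backoff
        System.out.println(timeoutIgnoringPeriodicTimer(backoffMs));
    }
}
```

Under these assumptions, once the timer expires the old computation returns zero on every poll, which matches the high CPU usage reported in this thread.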

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, José Armando García
Sancio <jsancio@apache.org>, Alyssa Huang <ahuang@confluent.io>,
Kuan-Po Tseng <brandboat@gmail.com>

chia7712

@kevin-wu24 thanks for this fix. I ran the patch locally, and the CPU usage has improved. Is it possible to add a unit test for it?

kevin-wu24 · 2025-08-14T13:02:49Z

@chia7712 thanks for pointing out the issue.

> Is it possible to add a unit test for it?

I'm not super sure what a unit test would look like, since the backoff logic is not a correctness thing, but rather an efficiency/performance thing.

chia7712 · 2025-08-14T13:28:18Z

> I'm not super sure what a unit test would look like, since the backoff logic is not a correctness thing, but rather an efficiency/performance thing.

It seems to me that the busy loop is a performance issue, as it could lead to high CPU usage. I'm fine with adding a unit test in a follow-up, since the local test looks good. Otherwise, as soon as the broker starts running, my computer's fan spins up, which is a bit alarming.

jsancio

Thanks for the fix @kevin-wu24. Just a minor coding comment.

jsancio · 2025-08-14T14:33:59Z

raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java

+        if (shouldSendAddOrRemoveVoterRequest()) {
+            return Math.min(
+                backoffMs,
+                state.remainingUpdateVoterSetPeriodMs(currentTimeMs)
+            );
+        } else {
+            return backoffMs;
+        }


Can you change the implementation so that shouldSendAddOrRemoveVoterRequest is only evaluated once?

You can also change the back off computation to something like:

    return Math.min(
        backoffMs,
        shouldSendAddOrRemoveVoterRequest
            ? state.remainingUpdateVoterSetPeriodMs(currentTimeMs)
            : Integer.MAX_VALUE
    );

kevin-wu24 · 2025-08-14T16:02:46Z

Hi @chia7712, @jsancio and I had a discussion offline about some of the "backoff" logic in pollFollowerAsObserver and pollFollowerAsVoter. It is partially related to the bug here, but mainly the problem is that the calculation of the return value for these methods is conceptually incorrect as it is now. Basically, the fetch timeout and update voter set timeout mean two different things conceptually, but the code treats them similarly.

When the fetch timeout expires, it means the voter should transition states. This means the fetch timeout is something we should consider when calculating the backoff.

The update voter set timeout just means that every X amount of time, the replica should try to send an AddVoter, RemoveVoter, or UpdateVoter RPC. The value of this timer is not something we should consider when calculating the backoff. If we successfully send one of these RPCs, we should wait for sendResult.timeToWaitMs(). If we don't successfully send one of these RPCs because there is a pending request, we should also wait for sendResult.timeToWaitMs(). How long we back off depends only on the in-flight request's lifetime, not on the value of this timer.
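
The rule described above can be sketched in Java as follows. This is an illustration under stated assumptions, not the code in KafkaRaftClient: `SendResult`, `PeriodicTimer`, and `poll` are hypothetical stand-ins, and the two `SendResult` arguments represent the outcome each send attempt would produce.

```java
// Hypothetical stand-ins for illustration; not the real KafkaRaftClient types.
final class ObserverPollSketch {
    /** Outcome of a send attempt: whether a request went out, and how long to wait next. */
    static final class SendResult {
        final boolean requestSent;
        final long timeToWaitMs;

        SendResult(boolean requestSent, long timeToWaitMs) {
            this.requestSent = requestSent;
            this.timeToWaitMs = timeToWaitMs;
        }
    }

    /** Models the "try a voter set change every periodMs" timer. */
    static final class PeriodicTimer {
        private final long periodMs;
        private long deadlineMs;

        PeriodicTimer(long periodMs, long nowMs) {
            this.periodMs = periodMs;
            this.deadlineMs = nowMs + periodMs;
        }

        boolean hasExpired(long nowMs) {
            return nowMs >= deadlineMs;
        }

        void reset(long nowMs) {
            deadlineMs = nowMs + periodMs;
        }
    }

    /**
     * Returns how long the caller should wait before polling again.
     * The timer only decides which request to attempt; the wait itself
     * always comes from the attempt's SendResult.
     */
    static long poll(
        boolean canChangeVoterSet,
        PeriodicTimer timer,
        SendResult voterChangeAttempt,
        SendResult fetchAttempt,
        long nowMs
    ) {
        if (canChangeVoterSet && timer.hasExpired(nowMs)) {
            if (voterChangeAttempt.requestSent) {
                timer.reset(nowMs); // reset only after a request actually went out
            }
            return voterChangeAttempt.timeToWaitMs;
        }
        return fetchAttempt.timeToWaitMs;
    }
}
```

In this sketch a replica that cannot change the voter set (for example a broker) never consults the timer at all, so an expired timer cannot shorten its wait.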

brandboat

Before this patch: [profiler screenshot not preserved]

After: [profiler screenshot not preserved]

I profiled for around 40 seconds, and the performance improvement is significant. Thanks for the fix!

brandboat

BTW, now FollowerState#remainingUpdateVoterSetPeriodMs is an unused method, could we remove it?

jsancio

Thanks for the fix. LGTM.

ahuang98 · 2025-08-14T18:22:26Z

raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java

-                state.remainingFetchTimeMs(currentTimeMs),
-                state.remainingUpdateVoterSetPeriodMs(currentTimeMs)
-            )
+            state.remainingFetchTimeMs(currentTimeMs)


I see, backoffMs already accounts for the time to wait before processing the result of any update voter request.

ahuang98

thanks for the improvements!

kevin-wu24 · 2025-08-14T20:33:25Z

@chia7712 can you re-trigger the CI? The Java 24 failure doesn't look related.

chia7712 · 2025-08-15T12:19:00Z

raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java

            if (sendResult.requestSent()) {
                state.resetUpdateVoterSetPeriod(currentTimeMs);
            }
+            return sendResult.timeToWaitMs();


What happens if timeToWaitMs is larger than the fetch timeout? Could the observer miss a fetch request?

kevin-wu24 · 2025-08-15

The observer does not need to consider the time left on the fetch timer when calculating the backoff, because an observer cannot transition to the prospective or candidate state. It must transition to the follower state first.

> What happens if timeToWaitMs is larger than the fetch timeout?

If this is the case, the observer would have to wait timeToWaitMs anyway, until its request manager no longer has a pending request. Only then can it resume fetching or sending add/remove voter requests.
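
As a rough Java illustration of the distinction drawn in this thread, the sketch below contrasts the two backoff computations. The method names are hypothetical, and the voter case is inferred from the discussion and the diff above rather than copied from KafkaRaftClient.

```java
// Hypothetical helper names; a sketch of the rule discussed, not Kafka's code.
final class FollowerBackoffSketch {
    /**
     * A voter must wake up when its fetch timeout expires so it can
     * transition states, so the remaining fetch time caps the backoff.
     */
    static long voterBackoffMs(long requestBackoffMs, long remainingFetchTimeMs) {
        return Math.min(requestBackoffMs, remainingFetchTimeMs);
    }

    /**
     * An observer has no transition to make when the fetch timeout expires,
     * so it simply waits for the request backoff.
     */
    static long observerBackoffMs(long requestBackoffMs) {
        return requestBackoffMs;
    }
}
```

For example, with a request backoff of 2000 ms and 500 ms left on the fetch timer, the voter variant returns 500 while the observer variant returns 2000.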

broker observer should not read update voter set timer value when pol…

cdb1b43

…ling

github-actions bot added the triage PRs from the community label Aug 14, 2025

kevin-wu24 mentioned this pull request Aug 14, 2025

KAFKA-19078: Automatic controller addition to cluster metadata partition #19589

Merged

github-actions bot added kraft small Small PRs labels Aug 14, 2025

fix last commit

c5e8704

chia7712 added the ci-approved label Aug 14, 2025

chia7712 reviewed Aug 14, 2025

chia7712 approved these changes Aug 14, 2025

jsancio reviewed Aug 14, 2025

code review

dffbba5

kevin-wu24 added 2 commits August 14, 2025 11:08

fix backoff calculation

a46bf21

cleanup

549b65a

brandboat approved these changes Aug 14, 2025

brandboat reviewed Aug 14, 2025

remove dead code

2bd450a

jsancio approved these changes Aug 14, 2025

ahuang98 reviewed Aug 14, 2025

ahuang98 approved these changes Aug 14, 2025

github-actions bot removed the triage PRs from the community label Aug 15, 2025

chia7712 reviewed Aug 15, 2025

jsancio changed the title ~~KAFKA-19605: Fix the busy loop occurring in the broker observer~~ KAFKA-19605; Fix the busy loop occurring in kraft client observers Aug 15, 2025

jsancio merged commit 833e25f into apache:trunk Aug 15, 2025
41 of 43 checks passed
