Optimize replication throttling in Executor to reduce Kafka admin requests #2214

Open
wants to merge 1 commit into main

Conversation

aswinayyolath
Contributor

Summary

  • Why:
    The current implementation of Cruise Control (CC) sets and clears replication throttles for each execution task individually, resulting in a high volume of admin requests and slowing down the rebalancing process. This change optimizes throttling by setting and clearing throttles only once per broker, at the start and end of a group of tasks, improving efficiency.

  • What:
    Updated Executor.java to introduce logic for setting throttles per broker at the beginning of task execution and clearing throttles only after all tasks for a broker are complete.

Expected Behavior

With this improvement, the replication throttles are set once per broker and cleared after all related tasks are finished, reducing admin request load and improving rebalance speed.
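
As a rough sketch of that per-broker flow (the class below and helpers like brokersInvolved(), remainingTasksForBroker(), setThrottleForBroker(), and clearThrottleForBroker() are hypothetical names for illustration, not the actual Cruise Control API):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of per-broker throttling in the Executor.
class PerBrokerThrottlingSketch {
  private final Set<Integer> _throttledBrokers = new HashSet<>();

  // Set throttles once per broker at the start of a group of tasks.
  void setThrottlesOncePerBroker(List<ExecutionProposal> proposals) {
    for (ExecutionProposal proposal : proposals) {
      for (int brokerId : brokersInvolved(proposal)) {
        // Set.add() returns true only on first insertion, so each broker
        // gets exactly one admin request instead of one per task.
        if (_throttledBrokers.add(brokerId)) {
          setThrottleForBroker(brokerId);
        }
      }
    }
  }

  // Clear a broker's throttle only after all of its tasks have finished.
  void maybeClearThrottle(int brokerId) {
    if (remainingTasksForBroker(brokerId) == 0 && _throttledBrokers.remove(brokerId)) {
      clearThrottleForBroker(brokerId);
    }
  }
}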

Actual Behavior

Currently, throttling is set and cleared for every individual task, which can cause unnecessary delays during rebalancing due to excessive Kafka admin requests.

Additional Evidence

I have run the tests to make sure they all pass.

[screenshots: local test runs showing all tests passing]

Additional notes

Made changes like the following:

partitionMovementTasksByState.getOrDefault(ExecutionTaskState.PENDING, 0)

The code now uses getOrDefault when reading task-state counts, which avoids a potential NullPointerException when a state is not present in the map.
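
For example, when no task has reached a given state yet, the key is absent from the map; get() would return null and unboxing it to int would throw, while getOrDefault() falls back safely:

Map<ExecutionTaskState, Integer> partitionMovementTasksByState = new HashMap<>();
// int pending = partitionMovementTasksByState.get(ExecutionTaskState.PENDING); // NullPointerException if absent
int pending = partitionMovementTasksByState.getOrDefault(ExecutionTaskState.PENDING, 0); // falls back to 0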

Categorization

  • documentation
  • bugfix
  • new feature
  • refactor
  • security/CVE
  • other

This PR resolves

This PR resolves #1972

Optimize replication throttling in Executor to reduce Kafka admin requests

Optimizes replication throttling to enhance efficiency during
rebalance by reducing redundant Kafka admin requests, thereby
improving overall task execution time.

Signed-off-by: Aswin A <aswin6303@gmail.com>
Contributor

@CCisGG left a comment

Added a few comments. Also, please add unit tests for this change.

@@ -1644,7 +1656,11 @@ private void interBrokerMoveReplicas() throws InterruptedException, ExecutionExc
.collect(Collectors.toList());
inProgressTasks.addAll(inExecutionTasks());

throttleHelper.clearThrottles(completedTasks, inProgressTasks);
// Clear throttles only after all tasks are complete for the brokers involved
if (partitionsToMove == 0) {
Contributor

I think you may want to check all of these conditions:

(partitionsToMove == 0 && inExecutionTasks().isEmpty()) || (_stopSignal.get() != NO_STOP_EXECUTION)
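
Applied to the hunk above, the suggested guard would look roughly like this (a sketch of the review suggestion, not the merged code):

// Clear throttles only when nothing remains to move and nothing is still
// executing, or when the execution has been stopped.
if ((partitionsToMove == 0 && inExecutionTasks().isEmpty()) || (_stopSignal.get() != NO_STOP_EXECUTION)) {
  throttleHelper.clearThrottles(completedTasks, inProgressTasks);
}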

Contributor Author

Okay 👍

throttleHelper.setThrottles(tasksToExecute.stream().map(ExecutionTask::proposal).collect(Collectors.toList()));
// Set throttles only if not already set for brokers involved
List<ExecutionProposal> proposals = tasksToExecute.stream().map(ExecutionTask::proposal).collect(Collectors.toList());
for (ExecutionProposal proposal : proposals) {
Contributor

Does it make more sense to add a map inside the throttle helper, so that you don't need to expose the logic in the executor? I think the throttle helper should be responsible for remembering which brokers are already throttled and simply not send redundant requests to those brokers. Same with the clear-throttle logic.
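
A minimal sketch of that idea, assuming the bookkeeping lives inside the helper (applyThrottleConfig(), removeThrottleConfig(), and brokersInvolved() are hypothetical names used for illustration):

class ReplicationThrottleHelper {
  // Brokers whose throttle configs have already been applied.
  private final Set<Integer> _throttledBrokers = new HashSet<>();

  void setThrottles(List<ExecutionProposal> proposals) {
    for (ExecutionProposal proposal : proposals) {
      for (int brokerId : brokersInvolved(proposal)) {
        // Skip brokers that are already throttled; no redundant admin request.
        if (_throttledBrokers.add(brokerId)) {
          applyThrottleConfig(brokerId);
        }
      }
    }
  }

  void clearThrottle(int brokerId) {
    // Only issue a clear request if this broker was actually throttled.
    if (_throttledBrokers.remove(brokerId)) {
      removeThrottleConfig(brokerId);
    }
  }
}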

Contributor Author

I will take a look

Successfully merging this pull request may close these issues.

Redundant set/unset throttling during the rebalance