allow configuring controller's concurrency #1277

Merged on Sep 5, 2023 (4 commits)

aryan9600
Contributor

Motivation

It's often desirable to cap the number of reconciliations that the controller can run at a given moment, either to get more predictable behavior or to make better use of the host machine's resources.

Fixes #1248

Solution

Add `concurrency` to `controller::Config`, which sets a limit on the number of concurrent reconciliations that the controller can execute at any given moment. It defaults to 0, which lets the controller run with unbounded concurrency.
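
To make the "0 means unbounded" semantics concrete, here is a minimal standalone sketch. It is not the code from this PR: the `Config` struct below is a hypothetical stand-in, the `u16` field type is an assumption, and `limit` and `can_start` are helper names invented for illustration. Only the option name `concurrency` and the default of 0 come from the description above.

```rust
/// Hypothetical stand-in for a controller config (NOT the kube-runtime type).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Config {
    /// Maximum number of concurrent reconciliations; 0 means unbounded.
    /// The `u16` width is an assumption made for this sketch.
    concurrency: u16,
}

impl Config {
    /// Builder-style setter named after the option described above.
    pub fn concurrency(mut self, concurrency: u16) -> Self {
        self.concurrency = concurrency;
        self
    }

    /// Hypothetical helper: `None` for unbounded, `Some(limit)` otherwise.
    pub fn limit(&self) -> Option<usize> {
        match self.concurrency {
            0 => None,
            n => Some(usize::from(n)),
        }
    }

    /// Hypothetical helper: may another reconciliation start while
    /// `running` reconciliations are already in flight?
    pub fn can_start(&self, running: usize) -> bool {
        self.limit().map_or(true, |max| running < max)
    }
}
```

With this sketch, `Config::default()` never refuses a new reconciliation, while `Config::default().concurrency(2)` allows a third one to start only after one of the two in flight has finished.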

codecov bot commented Aug 12, 2023

Codecov Report

Merging #1277 (9e3334b) into main (0a5fb72) will increase coverage by 0.22%.
The diff coverage is 87.96%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1277      +/-   ##
==========================================
+ Coverage   72.21%   72.43%   +0.22%     
==========================================
  Files          75       75              
  Lines        6258     6337      +79     
==========================================
+ Hits         4519     4590      +71     
- Misses       1739     1747       +8     
Files Changed Coverage Δ
kube-runtime/src/controller/mod.rs 34.19% <67.74%> (-0.02%) ⬇️
kube-runtime/src/scheduler.rs 97.16% <86.66%> (-0.68%) ⬇️
kube-runtime/src/controller/runner.rs 94.94% <98.33%> (+1.28%) ⬆️
kube-runtime/src/controller/future_hash_map.rs 95.45% <100.00%> (+0.21%) ⬆️

... and 1 file with indirect coverage changes

Member

@clux clux left a comment

hey again! i like the general approach here. only had time to put some quick comments on the public interface (which only has a few nits) as i'm not super online this week. hoping to leave the small internal extension there to natalie.

@clux clux added the changelog-add changelog added category for prs label Aug 14, 2023
@clux clux added this to the 0.86.0 milestone Aug 14, 2023
@clux clux requested a review from nightkr August 14, 2023 08:03
Member

@clux clux left a comment

Finally got around to really looking at this, and even tested it out a little bit in a busy controller. Functionally, it looks great 👍. It seems to behave the desired way, flattening the curve (compressing the reconciliation rate graph down).

The only points I have to make here are about 1. the complexity of the implementation and 2. whether backpressure will become a problem here.

  1. This is already a very complicated part of the codebase, and I would like to have the helpers as contained as possible. I have put some questions/comments in various places.
  2. If users set concurrency: 1 (controller runtime's default btw) in a highly parallel controller you might end up in a bad failure mode; throttling the controller, but continuing to fill the scheduler's pending set (afaikt). Maybe we need a configurable max limit on the number of pending reconciliations and just return a new error type after this? Or maybe we let it grow for now and defer to backpressure mechanisms in the applier or issues at a later time (this is an opt-in thing after all, and we already have deduping to help us out). I am not sure, but I am curious to see what controller-runtime does here, if they do anything.

Member

nightkr commented Aug 29, 2023

@clux:

If users set concurrency: 1 (controller runtime's default btw) in a highly parallel controller you might end up in a bad failure mode; throttling the controller, but continuing to fill the scheduler's pending set (afaikt). Maybe we need a configurable max limit on the amount of pending reconciliations and just return a new error type after this?

`pending` is largely the best waiting lot for that backpressure, IMO. If we propagate the backpressure beyond the scheduler then we end up with 1) stale caches (because the reflector would also be backpressured), 2) less deduplication, and 3) more memory usage (for modified objects we'd store both the old version in the reflector cache and the new version in the queue).

The memory cost of the pending queue (a few strings per object) is also trivial compared to the reflector cache itself (a full copy of each object, regardless of whether it's even queued at all).
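
A rough sketch of that argument, under stated assumptions: the `Scheduler` type and its methods below are hypothetical and much simpler than the real scheduler and runner in kube-runtime. It only illustrates the three properties discussed here: the pending set stores keys rather than objects, repeated requests for the same key are deduplicated, and a concurrency limit leaves the excess waiting in `pending` instead of pushing back on the producer.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical, simplified scheduler (NOT the kube-runtime implementation).
/// It stores object keys only, never full objects.
pub struct Scheduler {
    max_running: usize,       // 0 = unbounded, as in the PR description
    queue: VecDeque<String>,  // FIFO order of pending keys
    pending: HashSet<String>, // same keys as `queue`, for O(1) dedup checks
    running: HashSet<String>, // keys currently being reconciled
}

impl Scheduler {
    pub fn new(max_running: usize) -> Self {
        Scheduler {
            max_running,
            queue: VecDeque::new(),
            pending: HashSet::new(),
            running: HashSet::new(),
        }
    }

    /// Request a reconcile. Returns `false` if the key was already pending,
    /// i.e. the request was deduplicated and used no extra memory.
    pub fn request(&mut self, key: &str) -> bool {
        if self.pending.contains(key) {
            return false;
        }
        self.pending.insert(key.to_string());
        self.queue.push_back(key.to_string());
        true
    }

    /// Start the next pending key, or return `None` if the limit is reached
    /// or nothing is eligible. Keys that are already running are skipped so
    /// the same object is never reconciled twice at once (an assumption of
    /// this sketch).
    pub fn start_next(&mut self) -> Option<String> {
        if self.max_running != 0 && self.running.len() >= self.max_running {
            return None;
        }
        let idx = self.queue.iter().position(|k| !self.running.contains(k))?;
        let key = self.queue.remove(idx)?;
        self.pending.remove(&key);
        self.running.insert(key.clone());
        Some(key)
    }

    /// Mark a reconciliation as finished, freeing one concurrency slot.
    pub fn finish(&mut self, key: &str) {
        self.running.remove(key);
    }

    /// Number of keys waiting to be reconciled.
    pub fn pending_len(&self) -> usize {
        self.queue.len()
    }
}
```

With `Scheduler::new(1)`, requesting `"ns/a"` twice and `"ns/b"` once leaves two pending keys; only one of them can start until `finish` is called, and the other simply waits in the queue at the cost of one string.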

Member

clux commented Aug 30, 2023

Thanks @nightkr, that is reassuring. I'll resolve the comments related to congestion or backpressure.

Add `concurrency` to `controller::Config`, which sets a limit on the
number of concurrent reconciliations that the controller can execute at
any given moment. It defaults to 0, which lets the controller run with
unbounded concurrency.

Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
Member

@clux clux left a comment

Thank you very much. This is looking good to go from my end 👍

@clux clux merged commit e531d83 into kube-rs:main Sep 5, 2023
16 checks passed