-
Notifications
You must be signed in to change notification settings - Fork 428
Description
Is your feature request related to a problem? Please describe.
K0s's current implementation of the LeaderElector needs some thoughts and design decisions:
Interface/Contract
The interface in use is as follows:
k0s/pkg/component/controller/leaderelector.go
Lines 29 to 34 in 62bef64
// LeaderElector is the common leader elector component to manage each controller leader status | |
type LeaderElector interface { | |
IsLeader() bool | |
AddAcquiredLeaseCallback(fn func()) | |
AddLostLeaseCallback(fn func()) | |
} |
-
How and when are callbacks invoked?
- Are they called in sequence, or concurrently, i.e. should those callbacks spawn their own goroutine if required, or are they already in a goroutine?
- Will the acquire callbacks be invoked concurrently to the lost callbacks? Is it the responsibility of the LeaderElector or the responsibility of the callback implementors to ensure that nothing is overlapping?
- Acquire and lost callbacks are added separately, not as pairs. This seems odd.
-
When it is allowed to add function callbacks?
The interface looks like it's legit to add new callbacks at any time. This is not on par with the implementations. More on that below. -
How to remove previously added callbacks?
Other components register themselves with the LeaderElector, but they also do have their own lifecycle. What if they get stopped? There's currently no way to unregister from the LeaderElector again. So their callback would be invoked even after they have been stopped. It's not a big deal in the way that k0s is currently implemented, since the only situation in which things are stopped is on shutdown, which inherently means that everything is stopped. Still, this seems like a design flaw, since components registering themselves with the LeaderElector simply cannot be stopped correctly at the moment.
Implementations
There are currently two implementations of the LeaderElector interface, one dummy implementation that pretends that it's always leading, used for the --single
controller, and the real thing, which is backed by client-go's leader elector, wrapped by k0s's "LeasePool". Both of them are Components, so they have a lifecycle. Besides the general points about the interface itself, I see two implementation-specific problems:
-
The dummy leader elector is too simple.
It doesn't implement any lifecycle at all. The only lifecycle-aware thing that it does is that it's calling the acquire callbacks in its run method. It's neither doing similar things for itsIsLeader
method, which is unconditionally returningtrue
, no matter if the component has just been created, it is started, or if it is stopped already, nor is it calling any lost callbacks when it's stopped. -
Adding new callbacks to already running leader electors won't work as expected.
The lease pool backed implementation will at least call them with the next matching event, i.e. if it's already leading, an acquire callback will only be called the next time the lease has been won, not directly after adding it. While this will be eventually be catching up in theory, in practice this means that it's not working, since the lease will be held for a long time, possibly as long as the whole controller process's lifetime. The dummy leader elector will not do anything with those newly added callbacks at all.
Describe the solution you would like
-
I tend to require that all callbacks be called in their own goroutine by all LeaderElectors. We can document that on the interface, the implementation is simple. The benefits are: Users of the LeaderElector don't have to spawn their own goroutine, just to make sure that they aren't blocking the LeaderElector, and the LeaderElector doesn't need to rely on the callbacks that they actually don't block it.
-
and 5.: With the current implementations, it's only possible to add callbacks before the leader electors have been started. I see two ways to improve on that:
- Implement callback addition correctly, no matter in which state the leader elector component is. This would be more versatile and allow for things like nested components, which are started and stopped depending on the cluster configuration.
- Error out or panic when trying to add callbacks when the leader electors are already running. This is not very elegant, doesn't really fit well to the LeaderElector contract and still requires components to be started "in the right order", but at least it catches problems early and makes them visible, instead of just continuing in a broken state.
-
Combine the acquire and lost callbacks into a single callback that gets a context which is cancelled when the lease gets lost, and return a "Registration" object, i.e. like so:
type LeaderElector interface { AddAcquiredLeaseCallback(callback func(context.Context)) Registration // ... } type Registration interface { Cancel() }
This has several advantages. It natively combines the acquire/lost callback pairs into a single callback function, making it easier for both the LeaderElector implementation and callback implementors. The LeaderElector has only one callback type that needs to be managed, instead of two, and there's no need to group callbacks into pairs. The implementor has a ready-to-be-used context at its disposal, and doesn't need to setup something on its own, which it then cancels when the lost callback is invoked. Even straight forward implementations can just block on
<-ctx.Done()
to wait for the lease to be lost. This is actually the technique that's also used inside the vanilla client-go leader elector implementation, but it gets covered up by LeasePool. Having a single callback function makes it easier to define expectations around concurrent calls of the callbacks. It seems easy enough for the LeaderElector implementation to be able to ensure that each callback is never invoked concurrently, i.e. the next invocation only happens after the previous one returned. TheCancel
method can be used to unregister the previously added callback, and allows for a clean way to implement component stopping. -
Replace the dummy leader elector with the real implementation, except that the internal LeasePool will be a "fake" one. That way, both implementation behave the same from a component/lifecycle point of view, i.e. callback registrationm dispatching and so on.
Describe alternatives you've considered
No response
Additional context
No response