-
Notifications
You must be signed in to change notification settings - Fork 594
Adding new Scheduler interface methods for scaling #1321
Adding new Scheduler interface methods for scaling #1321
Conversation
* @param existingContainers Set of containers currently managed by the scheduler | ||
* @param containersToRemove Set of containers to be remove by the scheduler | ||
*/ | ||
void removeContainers(Set<PackingPlan.ContainerPlan> existingContainers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be assumed that the two sets, set for add and remove, will always be disjoint? If so I think it should be clearly documented in javadoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, since nothing disallows someone from adding a bunch of containers and then removing them. I did add a javadoc though to removeContainers to clarify that containersToRemove must be a subset of existing.
Added @VisibleForTesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering:
- Can we use only one method: void removeContainers(Set<PackingPlan.ContainerPlan> existingContainers, Set<PackingPlan.ContainerPlan> containersToRemove);
- Why the arguments for addContainers & removeContainers are not similar, i.e. why removeContainers needs existingContainers while addContainers does not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also mulling over that inconsistency. We first implemented as this:
void removeContainers(int existingContainerCount, int numContainersToRemove);
and for the Aurora impl we could do the math to determine the ids last of the containers to be removed. But Yarn needed to know specifically which ones to remove, so we changed it to the current signature. There was also some thinking that we could assert that existingContainers
passed was the same as the implementation was already managing, as a type of guard against race conditions of multiple edits.
As I think more about it I don't think either of those reasons are totally valid and we should probably remove Set existingContainers
from the signature. @ashvina thoughts on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it would be better to remove existingContainers
. The scheduler will be orchestrating the update operation. In which case it will already have access to existing containers.
However, we may ask if update operation needs to be executed as a transaction? Or mimic a transaction by using some sort of packing plan version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I'll remove existingContainers
from the signature. And we can look into packing plans ids as a means for transactions.
|
||
import com.twitter.heron.spi.packing.PackingPlan; | ||
|
||
public interface ScalableScheduler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of "ScalableScheduler" indicates it is a scheduler interface. However, it is for "Scalable" features only. It could be better to remove Scheduler to make it "Scalable" or "IScalable". Then actual scheduler can :
class XScheduler implements IScheduler, IScalable {..}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent was to say that this is a Scheduler that's scalable, but I see your point. It seems like we could do one of these two options:
IScalable
with the add/remove methodsIScalableScheduler
that extendsIScheduler
and includes the add/remove methods.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a high level, the plan is to add a method onUpdate
to IScheduler
. This method will be invoked by RuntimeManager
. The implementation onUpdate
method will invoke itself to add and remove containers after necessary computations. Given this context I am thinking if a new interface is really needed? Will it be better to add addContainer
and removeContainer
methods to IScheduler
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all schedulers might support the ability to add/remove containers dynamically, which is why I created an interface for those that do. The 'onUpdate' method delegates to the UpdateTopologyManager
with an Optional<ScalableScheduler>
. If it is not present instance scaling is supported but not container scaling. If it is present both are.
What if we go with the IScalable
approach above for now and then we revisit in the next review where we introduce the UpdateTopologyManager
implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for IScalable
. And it also allows the flexbility to have one more interface IScalableScheduler
that extends IScheduler
and IScalable
in future if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me. Updated.
Committing a change to go with option 2 actually ( Also moved the interface to |
@maosongfu and @ashvina all comments addresses, let me know if this looks ok to you. |
👍 |
I'm going to merge this. If there are add'l comments we can address in a follow-up. |
Adds new scheduler interface methods for scaling (#1218) and the glue code to integrate. The Schedulers have empty impls for now for
updateTopology
. I'll follow up in another PR with implementations.