Global Scheduling #601
Comments
Makes sense to me.
I lean toward multiple entries with a label specifying that the container is globally scheduled. That way you can filter containers through labels.
Agree @abronan
I agree with @abronan on multiple entries. If a node is dead there is no need to reschedule the container, since it will already be present on another node. Rescheduling a global container can be prevented by referring to the map discussed above.
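The map idea discussed above (track which nodes already run the globally scheduled container, and only schedule onto the ones missing it) can be sketched as follows. This is a toy Python sketch; the node map and the label string are invented for illustration and are not Swarm's actual data structures:

```python
def nodes_missing_global(nodes, label):
    """Return the nodes that still need the globally scheduled container.

    `nodes` maps node name -> set of labels of containers running there.
    A dead node is simply absent from the map, so its copy is never
    rescheduled onto another node.
    """
    return [name for name, labels in nodes.items() if label not in labels]

nodes = {
    "node-1": {"global=logger"},
    "node-2": set(),              # newly added node, nothing running yet
    "node-3": {"global=logger"},
}
print(nodes_missing_global(nodes, "global=logger"))  # ['node-2']
```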
I'd love to go with the single entry approach. We've already run into the problem of listing a massive number of containers (and images) on our 50-node cluster (via DockerUI). As we are expecting to have at least 200 nodes in the near future, Virtual IDs would be a direct benefit to us. Aggregate the similar containers and maybe show them as a single summarized entry.
@chanwit We should allow both for convenience: a single-entry view as a quick overview of the globally scheduled containers, plus the detailed multiple-entry view.
I think having both makes sense, as the client needs to be able to remove the global schedule as well as restart certain instances in the cluster. My only worry is that once we introduce "virtual" containers, we will have to translate all the docker client APIs that target a single container into special cases. For example, I'm not sure what swarm should return when docker inspect is run on such a container, as it won't have a PID, network, etc.
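As a rough illustration of the single-entry view (not the actual Virtual ID design from #600), identical containers could be collapsed into one aggregated row with a count:

```python
from collections import Counter

def aggregate(containers):
    """Collapse identical (image, command) rows into one entry with a count."""
    counts = Counter((c["image"], c["command"]) for c in containers)
    return [f"{image}  {command}  x{n}" for (image, command), n in counts.items()]

containers = [
    {"image": "logstash", "command": "logstash -f conf"},
    {"image": "logstash", "command": "logstash -f conf"},
    {"image": "redis", "command": "redis-server"},
]
for row in aggregate(containers):
    print(row)
# logstash  logstash -f conf  x2
# redis  redis-server  x1
```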
+1 @abronan I like the idea of using labels to specify and filter the containers.
There is also an issue related to container naming. Today you can't reuse a name across nodes, because swarm abstracts the cluster as if it were a single host.
After giving it some extra thought, I think global scheduling is just a special case of scaling. We run into the exact same issues if we wanted to scale a container to X instances (single entry, naming, ...). I believe we should address the scaling issue at the same time; maybe there should be a higher-level concept that covers both.
@aluzzardi You don't necessarily schedule globally because you wish to scale.
@vbichov, indeed. However, in terms of design, implementation and user experience, the two are quite identical.
Is scaling a planned feature? Just select the top `-e constraint:scale=[number]` hosts from the list returned by the scheduler and run? (And apply the pigeonhole principle?)
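The pigeonhole idea above, spreading X instances across the cluster as evenly as possible, can be sketched like this (illustrative Python only, not Swarm's scheduler):

```python
def spread(instances, nodes):
    """Spread `instances` replicas across `nodes` as evenly as possible:
    the first `instances % len(nodes)` nodes get one extra replica."""
    base, extra = divmod(instances, len(nodes))
    return {node: base + (1 if i < extra else 0) for i, node in enumerate(nodes)}

print(spread(7, ["n1", "n2", "n3"]))  # {'n1': 3, 'n2': 2, 'n3': 2}
```

Global scheduling is then the special case `spread(len(nodes), nodes)`: exactly one replica per node.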
Maybe it's time to bring this up. @aluzzardi I am trying to conceptualize how the global scheduling would work.
From the design, I have found that global scheduling requires a specific implementation that I cannot relate to scaling. Please correct me if I am wrong.
I don't think that is necessary. We can just label the containers accordingly. By the way, do you mean that a globally scheduled container should be automatically deployed when a new node is added to the cluster (to match the other machines running those containers)?
@abronan yep, this may be optional. But I still think it's necessary to share the configuration of globally scheduled containers, in case of a swarm master failure.
Yes, it's a normal use case for Big Data / Hadoop clusters. This is also mentioned by @aluzzardi above.
Oooh, this! I'm all about peek-a-boo services when nodes come online. /cc @smashwilson
@chanwit Ok, my only concern with the auto-run on newly added nodes is that a Swarm can be divided into multiple sets/regions. Thus, a newly added node might be annotated with a label that says it belongs to another group (not necessarily running the same tasks/workloads). In this case, running the globally scheduled container automatically might not be the expected behavior from a user perspective.
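The region concern above amounts to filtering candidate nodes by the constraint before scheduling globally. A minimal sketch, with made-up node names and labels:

```python
def matches(node_labels, constraint):
    """Check a single 'key==value' constraint against a node's labels."""
    key, value = constraint.split("==")
    return node_labels.get(key) == value

def schedule_global(nodes, constraint):
    """Pick the nodes a global container should run on, honoring the constraint."""
    return [name for name, labels in nodes.items() if matches(labels, constraint)]

nodes = {
    "fe-1": {"role": "frontend"},
    "fe-2": {"role": "frontend"},
    "db-1": {"role": "database"},   # different group: skipped
}
print(schedule_global(nodes, "role==frontend"))  # ['fe-1', 'fe-2']
```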
At the code level, it seems this kind of scheduler would act as a super scheduler.
@aluzzardi is it OK for me to start implementing this? @abronan if @aluzzardi is OK with me taking care of a PR for this, I'll really need your input.
I would also be very interested in an update on this topic.
+1 I hope this will land in Swarm; have a look at what others do:
Here is my shell script to deploy monitoring agents to each node using docker-machine, a kind of workaround:
@megastef In the meantime you can also use a Compose file and scale to the number of nodes you have, making sure to add an anti-affinity rule so that no two containers run on the same machine :) I agree it would be convenient to have this directly in Swarm, but Compose is also a cool way to solve it on top of Swarm.
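For reference, the workaround described above could look roughly like this classic-Swarm Compose file. The service name and image are placeholders; the `affinity:container!=...` environment entry is how classic Swarm expressed anti-affinity rules:

```yaml
# docker-compose.yml (v1 format), illustrative only
agent:
  image: example/monitoring-agent        # placeholder image
  environment:
    # classic Swarm anti-affinity: never co-locate two "agent" containers
    - "affinity:container!=*agent*"
```

Scaling it to the node count (e.g. `docker-compose scale agent=5` on a 5-node cluster) then places one container per node.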
@abronan Thank you! This seems to work and is better than the shell script.
And these commands:
But it needs to be re-run whenever the number of nodes changes. So I'm still looking forward to the new global scheduling feature ;)
A problem I noticed with this approach concerns container naming.
I think that should be handled by Compose by using a unique project name. That way you'll never have duplicate container names, even if you have the same service name in different projects.
@dnephin Basically I agree. The problem is that you do not know the project name inside the
Found the related issue :) docker/compose#2294
+1 |
Given global services as a new capability in 1.12, what remains to be implemented here? Eliminating naming conflicts between an agent's existing containers/services with the same name?
Given the global service implementation in Docker 1.12, there is not much left for Swarm to do. Container name conflicts are not part of global services.
@leecalcote I will test the swarm global service with the sematext docker agent. Thanks for the hint!
This feature would allow scheduling a container on every single node in the cluster. A typical use case would be system containers (such as log collectors).

This could be combined with constraints, e.g. `role==frontend`. In that case, the container would only be scheduled on `frontend` machines (current and future).

Open questions:

- What happens with `docker ps`? Single entry or multiple entries? (Depends on Virtual IDs. See Virtual Container IDs #600.)
- How are `ID` and `name` reported? How do we operate them? For instance, what would `docker logs <global container>` do? How do we see the status of each one?
- How do we know an `ID` is a global container? How do we remove the global container? That is, `docker rm` would remove only that particular instance, not all the other ones. Also, it wouldn't prevent scheduling that global container onto new machines.
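To make the last open question concrete, here is a toy model (all names invented, not a proposed API) of the difference between removing one instance and removing the global container itself:

```python
class GlobalContainer:
    """Toy model: a globally scheduled container with one instance per node."""

    def __init__(self, name, nodes):
        self.name = name
        self.instances = set(nodes)   # one instance per node
        self.active = True            # still scheduled onto new nodes?

    def rm_instance(self, node):
        # Like `docker rm` on one instance: that node's copy goes away, but
        # the global schedule stays active, so new nodes would still get one.
        self.instances.discard(node)

    def rm_global(self):
        # Removing the global container proper: drop all instances and stop
        # scheduling onto future nodes.
        self.instances.clear()
        self.active = False

g = GlobalContainer("logger", ["n1", "n2", "n3"])
g.rm_instance("n2")
print(sorted(g.instances), g.active)   # ['n1', 'n3'] True
g.rm_global()
print(sorted(g.instances), g.active)   # [] False
```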