Optimizing scheduling of build pods #946
@betatim and I were speaking a lot about this, and these are my notes that can help us implement something.
Notes
Implementation idea (v1)
Algorithm pro / cons
Future improvements
Standalone points
It was great discussing this and seeing how we started with a fairly complicated idea like "write a custom scheduler" and now have a much simpler solution!
There is another implementation of rendezvous hashing here.
To get the possible values of the
The following would be my suggestion for how to split this into several steps that can be tackled individually:
What do you think? And do you want to tackle one of these already? Maybe we can start a new issue on "Resource requests for build and DIND pods" to discuss what the options are and how to configure things.
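For readers unfamiliar with rendezvous hashing, here is a minimal sketch of the idea, not the implementation linked above or the one that ended up in BinderHub. The function name `rendezvous_rank`, the node names, and the repo URL are all made up for illustration. Each (node, repo) pair gets a hash, and the nodes are ranked by that hash, so a given repo keeps preferring the same nodes as long as they exist.

```python
import hashlib


def rendezvous_rank(repo_url, node_names):
    """Return node names ordered from most to least preferred for repo_url."""

    def score(node_name):
        # Hash the (node, repo) pair; the node with the highest digest wins.
        key = f"{node_name}\0{repo_url}".encode("utf-8")
        return hashlib.sha256(key).digest()

    return sorted(node_names, key=score, reverse=True)


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
repo = "https://github.com/example/repo"  # hypothetical repo URL
ranking = rendezvous_rank(repo, nodes)
print(ranking)

# Dropping the least preferred node does not change the order of the others,
# because every node's score depends only on that node and the repo.
remaining = [n for n in nodes if n != ranking[-1]]
print(rendezvous_rank(repo, remaining) == ranking[:-1])
```

The appeal for this issue is that no state has to be stored anywhere: any component that knows the repo and the current list of nodes can recompute the same preference order.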
@betatim I think this may be quite fun to implement, but I have a long list of things to work on already so I figure I'll leave this to you :) I'd be very happy to review whenever work is done and continue the discussion on implementation aspects.
I've started work on 1 (and a little bit of 2). PR coming soon.
I think #949 and follow ups implemented this so I'll close this. Maybe we can make new issues with some of the possible improvements/ideas we had beyond what is implemented.
This issue is about how to optimize the scheduling of BinderHub specific Build Pods (BPs).
Out of scope of this issue is the discussion on how to schedule the user pods, which could be done with image locality (ImageLocalityPriority configuration) in mind.
Scheduling goals
It is believed that a build pod rebuilding a repo would often work faster on a node that has previously built that repo, even though the repo has changed, for example in its README.md file. So, we want to schedule a repo's build pods on nodes where that repo has previously been built, if possible.
We would also like to avoid scheduling pods on nodes that may want to scale down.
What is our desired scheduling practice?
We need to answer how we actually want the build pods to be scheduled. There is no obvious way to do it, since it is typically hard to optimize for both performance and auto-scaling viability at the same time, for example.
Boilerplate desired scheduling practice
I'll now provide a boilerplate idea to start from on how to schedule the build pods.
Technical solution implementation details
We must utilize a non-default scheduler. We could use the default kube-scheduler binary and customize its behavior through configuration, or we could make our own. I think we add too much complexity if we are to make our own though, even though making your own scheduler is certainly possible.
We utilize a custom scheduler, but like z2jh's scheduler we deploy an official kube-scheduler binary with a customized configuration that we can reference from the build pod's specification using the spec.schedulerName field.
We customize the kube-scheduler binary, just like in z2jh through a provided config, but we try to use node annotations somehow. For example, we could make the build pod annotate the node it runs on with the repo it attempts to build, and then later the scheduler can attempt to schedule builds of that repo on this node.
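As a rough illustration of the spec.schedulerName part, the sketch below builds a minimal Pod manifest as a plain Python dict. The helper name `build_pod_manifest`, the scheduler name `binderhub-build-scheduler`, and the image are assumptions for this example, not names BinderHub uses.

```python
import json


def build_pod_manifest(name, image, scheduler_name="binderhub-build-scheduler"):
    """Return a minimal Pod manifest that opts in to a non-default scheduler."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pods that omit this field are handled by "default-scheduler".
            "schedulerName": scheduler_name,
            "restartPolicy": "Never",
            "containers": [{"name": "builder", "image": image}],
        },
    }


# Hypothetical pod name and image, printed as JSON for inspection.
print(json.dumps(build_pod_manifest("build-example", "example/builder:latest"), indent=2))
```

Only pods that set this field are affected, so user pods and everything else in the cluster keep using whatever scheduler they used before.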
We make the BinderHub builder pod annotate the node by communicating with the k8s API. To allow the build pod to communicate like this, some RBAC setup is required, for example like z2jh's user-scheduler RBAC. It will need a ServiceAccount, a ClusterRole, and a ClusterRoleBinding, where the ClusterRole defines that it should be allowed to read and write annotations on nodes. For an example of a pod communicating with the k8s API, we can learn relevant parts from z2jh's image-awaiter, which also communicates with the k8s API.
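A sketch of what the annotation step could look like, under several assumptions: the annotation prefix `binderhub.example/built-` is invented, the repo URL is hashed because annotation keys cannot contain characters such as ":" or extra "/", and the RBAC rules are expressed as a Python list rather than a YAML ClusterRole. The `annotate_node` helper uses the official `kubernetes` Python client and is only defined here, not called, since it needs in-cluster credentials. The pod would also need to know its node name, which is typically exposed through the downward API (spec.nodeName).

```python
import hashlib

ANNOTATION_PREFIX = "binderhub.example/built-"  # hypothetical annotation prefix


def annotation_key(repo_url):
    """Map a repo URL to a valid annotation key by hashing it."""
    digest = hashlib.sha256(repo_url.encode("utf-8")).hexdigest()[:16]
    return ANNOTATION_PREFIX + digest


def node_annotation_patch(repo_url, value="true"):
    """Build a patch body that records repo_url in a node's annotations."""
    return {"metadata": {"annotations": {annotation_key(repo_url): value}}}


def annotate_node(node_name, repo_url):
    """Apply the patch; requires the `kubernetes` package and in-cluster credentials."""
    from kubernetes import client, config

    config.load_incluster_config()
    client.CoreV1Api().patch_node(node_name, node_annotation_patch(repo_url))


# Rules the ClusterRole would need so the calls above are permitted.
NODE_ANNOTATOR_RULES = [
    {"apiGroups": [""], "resources": ["nodes"], "verbs": ["get", "list", "patch"]},
]

print(node_annotation_patch("https://github.com/example/repo"))
```

Note that the patch verb on nodes is a cluster-wide permission, which is part of why this approach adds operational weight compared to a stateless scheme.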
We make the BinderHub image-cleaner binary also clean up the associated node annotations when it cleans a node, which would require RBAC permissions like those the build pods need to annotate nodes in the first place.
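The cleanup side could be sketched as below, again with an invented annotation prefix and helper name. It relies on the fact that in a merge-style patch, setting a key to null (None in Python) removes it, so the image-cleaner would only need to list a node's current annotations and send one patch.

```python
def remove_annotations_patch(current_annotations, prefix="binderhub.example/built-"):
    """Build a patch body that deletes every annotation whose key starts with prefix."""
    # A None value serializes to JSON null, which removes the key when merged.
    stale = {key: None for key in current_annotations if key.startswith(prefix)}
    return {"metadata": {"annotations": stale}}


# Hypothetical annotations as they might be read from a node.
annotations = {
    "binderhub.example/built-0123456789abcdef": "true",
    "node.alpha.kubernetes.io/ttl": "0",
}
print(remove_annotations_patch(annotations))
```

Annotations that do not match the prefix are left out of the patch entirely, so they stay untouched on the node.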
Knowledge reference
kube-scheduler configuration
https://kubernetes.io/docs/concepts/scheduling/kube-scheduler/#kube-scheduler-implementation
kube-scheduler source code
https://github.com/kubernetes/kubernetes/tree/master/pkg/scheduler
Kubernetes scheduling basics
From this video, you should pick up the role of a scheduler, and that the default binary that can act as a scheduler can consider predicates (aka. filtering) and priorities (aka. scoring).
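To make the predicates/priorities distinction concrete, here is a toy model in Python. It is not how kube-scheduler is implemented; the node and pod dicts, the function names, and the weight of 10 are all invented for illustration. Predicates filter out nodes that cannot run the pod at all, and priorities score the nodes that remain.

```python
def schedule(pod, nodes, predicates, priorities):
    """Pick a node: filter with predicates, then take the highest weighted score."""
    feasible = [n for n in nodes if all(pred(pod, n) for pred in predicates)]
    if not feasible:
        return None  # the pod would stay pending
    return max(feasible, key=lambda n: sum(w * score(pod, n) for score, w in priorities))


def has_enough_cpu(pod, node):
    # Predicate: a hard requirement, the node either fits the pod or not.
    return node["free_cpu"] >= pod["cpu"]


def built_repo_before(pod, node):
    # Priority: a soft preference for nodes that have built this repo.
    return 1 if pod["repo"] in node["built"] else 0


nodes = [
    {"name": "node-a", "free_cpu": 0.5, "built": {"repo-x"}},
    {"name": "node-b", "free_cpu": 4.0, "built": {"repo-x"}},
    {"name": "node-c", "free_cpu": 2.0, "built": set()},
]
pod = {"cpu": 1.0, "repo": "repo-x"}

chosen = schedule(pod, nodes, [has_enough_cpu], [(built_repo_before, 10)])
print(chosen["name"])  # node-b: node-a is filtered out, node-c scores lower
```

In this model the goal of reusing a node that built the repo before is a priority rather than a predicate, so a build still gets scheduled somewhere when the preferred node is full.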
Kubernetes scheduling deep dives
Past related discussion
#854