Document ExtendedResourceToleration functionality (flyteorg#924)
* Document ExtendedResourceToleration functionality

ExtendedResourceToleration can be used to automatically apply GPU tolerations; otherwise, configure Flyte to apply them.

Signed-off-by: Andrew Dye <andrewwdye@gmail.com>

* Formatting

Signed-off-by: Andrew Dye <andrewwdye@gmail.com>

* Fix punctuation

Signed-off-by: Andrew Dye <andrewwdye@gmail.com>

Signed-off-by: Andrew Dye <andrewwdye@gmail.com>
andrewwdye authored Nov 23, 2022
1 parent 8337b64 commit a3b9794
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions cookbook/deployment/configure_use_gpus.py
@@ -11,8 +11,9 @@
treat machines with GPUs and machines with CPUs equally. You may want to reserve machines with GPUs for tasks
that explicitly request GPUs. To achieve this, Flyte uses the Kubernetes concept of `taints and tolerations <https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/>`__.
-You can configure Flyte backend to automatically schedule your task onto a node with GPUs by tolerating specific taints.
-This configuration is controlled under generic k8s plugin configuration as can be found `here <https://github.com/flyteorg/flyteplugins/blob/5a00b19d88b93f9636410a41f81a73356a711482/go/tasks/pluginmachinery/flytek8s/config/config.go#L120>`__.
+Kubernetes can automatically apply tolerations for extended resources like GPUs using the `ExtendedResourceToleration plugin <https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#extendedresourcetoleration>`__, enabled by default in some cloud environments. Make sure the GPU nodes are tainted with a key matching the resource name, i.e., ``key: nvidia.com/gpu``.
+You can also configure Flyte backend to apply specific tolerations. This configuration is controlled under generic k8s plugin configuration as can be found `here <https://github.com/flyteorg/flyteplugins/blob/5a00b19d88b93f9636410a41f81a73356a711482/go/tasks/pluginmachinery/flytek8s/config/config.go#L120>`__.
The idea of this configuration is that whenever a task that can execute on Kubernetes requests for GPUs, it automatically
adds the matching toleration for that resource (in this case, ``gpu``) to the generated PodSpec.
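
The behaviour described in this hunk (a task that requests GPUs gets the matching toleration added to its generated PodSpec) can be sketched in a few lines of plain Python. This is an illustration only, not Flyte's implementation, which lives in the Go plugin code linked above. The names ``RESOURCE_TOLERATIONS``, ``tolerations_for`` and ``build_pod_spec`` are hypothetical. The toleration shape (``operator: Exists``, ``effect: NoSchedule``) is an assumption chosen to match a node tainted with ``key: nvidia.com/gpu``.

```python
from typing import Dict, List

Toleration = Dict[str, str]

# Hypothetical per-resource toleration table. It mirrors the idea of the
# k8s plugin configuration (resource name -> tolerations to apply), but the
# structure and values here are assumptions for illustration.
RESOURCE_TOLERATIONS: Dict[str, List[Toleration]] = {
    "nvidia.com/gpu": [
        {"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"},
    ],
}


def tolerations_for(
    requests: Dict[str, str],
    resource_tolerations: Dict[str, List[Toleration]],
) -> List[Toleration]:
    """Collect the tolerations configured for every requested resource."""
    result: List[Toleration] = []
    for resource_name in requests:
        for toleration in resource_tolerations.get(resource_name, []):
            if toleration not in result:  # skip duplicates
                result.append(toleration)
    return result


def build_pod_spec(requests: Dict[str, str]) -> dict:
    """Build a minimal PodSpec-like dict with the matching tolerations."""
    return {
        "containers": [
            {"name": "task", "resources": {"requests": dict(requests)}},
        ],
        "tolerations": tolerations_for(requests, RESOURCE_TOLERATIONS),
    }


if __name__ == "__main__":
    print(build_pod_spec({"cpu": "1", "nvidia.com/gpu": "1"})["tolerations"])
    print(build_pod_spec({"cpu": "1"})["tolerations"])
```

In this sketch a task that requests only CPU ends up with no tolerations, so it cannot be scheduled onto tainted GPU nodes. That is the reservation effect the documentation describes.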