diff --git a/website/content/docs/internals/plugins/csi.mdx b/website/content/docs/internals/plugins/csi.mdx
index ae21ec59fe..fa22319593 100644
--- a/website/content/docs/internals/plugins/csi.mdx
+++ b/website/content/docs/internals/plugins/csi.mdx
@@ -55,17 +55,25 @@ that perform both the controller and node roles in the same
 instance. Not every plugin provider has or needs a controller; that's
 specific to the provider implementation.
 
-You should always run node plugins as Nomad `system` jobs and use the
-`-ignore-system` flag on the `nomad node drain` command to ensure that the
-node plugins are still running while the node is being drained. Use
-constraints for the node plugin jobs based on the availability of volumes. For
-example, AWS EBS volumes are specific to particular availability zones with a
-region. Controller plugins can be run as `service` jobs.
+Plugins mount and unmount volumes but are not in the data path once
+the volume is mounted for a task. Plugin tasks are needed when tasks
+using their volumes stop, so plugins should be left running on a Nomad
+client until all tasks using their volumes are stopped. The `nomad
+node drain` command handles this automatically by stopping plugin
+tasks last.
+
+Typically, you should run node plugins as Nomad `system` jobs so they
+can mount volumes on any client where they are running. Controller
+plugins can create and attach volumes anywhere they can communicate
+with the storage provider's API, so they can usually be run as
+`service` jobs. You should always run more than one controller plugin
+allocation for high availability.
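+
+As a sketch, a minimal node plugin job might look like the following
+(the plugin ID, image, and paths are illustrative; consult your
+storage provider's documentation for the correct deployment):
+
+```hcl
+job "csi-node-plugin" {
+  # run one instance of the node plugin on every client
+  type = "system"
+
+  group "nodes" {
+    task "plugin" {
+      driver = "docker"
+
+      config {
+        image = "example/csi-driver:1.0"
+
+        # node plugins generally need privileged access to mount volumes
+        privileged = true
+      }
+
+      csi_plugin {
+        id        = "example-plugin"
+        type      = "node"
+        mount_dir = "/csi" # where the plugin expects to find csi.sock
+      }
+    }
+  }
+}
+```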
 
 Nomad exposes a Unix domain socket named `csi.sock` inside each CSI
 plugin task, and communicates over the gRPC protocol expected by the
 CSI specification. The `mount_dir` field tells Nomad where the plugin
-expects to find the socket file.
+expects to find the socket file. The path to this socket is exposed in
+the container as the `CSI_ENDPOINT` environment variable.
 
 ### Plugin Lifecycle and State
 
@@ -94,7 +102,7 @@ server and waits for a response; the allocation's tasks won't start
 until the volume has been claimed and is ready.
 
 If the volume's plugin requires a controller, the server will send an
-RPC to the Nomad client where that controller is running. The Nomad
+RPC to any Nomad client where that controller is running. The Nomad
 client will forward this request over the controller plugin's gRPC
 socket. The controller plugin will make the request volume available
 to the node that needs it.
@@ -110,13 +118,14 @@ client, and the node plugin mounts the volume to a staging area in
 the Nomad data directory. Nomad will bind-mount this staged directory
 into each task that mounts the volume.
 
-This cycle is reversed when a task that claims a volume becomes terminal. The
-client will send an "unpublish" RPC to the server, which will send "detach"
-RPCs to the node plugin. The node plugin unmounts the bind-mount from the
-allocation and unmounts the volume from the plugin (if it's not in use by
-another task). The server will then send "unpublish" RPCs to the controller
-plugin (if any), and decrement the claim count for the volume. At this point
-the volume’s claim capacity has been freed up for scheduling.
+This cycle is reversed when a task that claims a volume becomes
+terminal. The client frees the volume locally by making "unpublish"
+RPCs to the node plugin. The node plugin unmounts the bind-mount from
+the allocation and unmounts the volume from the plugin (if it's not in
+use by another task). The client will then send an "unpublish" RPC to
+the server, which will forward it to the controller plugin (if
+any), and decrement the claim count for the volume. At this point the
+volume’s claim capacity has been freed up for scheduling.
 
 [csi-spec]: https://github.com/container-storage-interface/spec
 [csi-drivers-list]: https://kubernetes-csi.github.io/docs/drivers.html
diff --git a/website/content/docs/job-specification/csi_plugin.mdx b/website/content/docs/job-specification/csi_plugin.mdx
index be0774c107..55cf152c59 100644
--- a/website/content/docs/job-specification/csi_plugin.mdx
+++ b/website/content/docs/job-specification/csi_plugin.mdx
@@ -51,28 +51,28 @@ option.
 
 ## Recommendations for Deploying CSI Plugins
 
-CSI plugins run as Nomad jobs but after mounting the volume are not in the
-data path for the volume. Jobs that mount volumes write and read directly to
+CSI plugins run as Nomad tasks, but after mounting the volume are not in the
+data path for the volume. Tasks that mount volumes write and read directly to
 the volume via a bind-mount and there is no communication between the job and
 the CSI plugin. But when an allocation that mounts a volume stops, Nomad will
 need to communicate with the plugin on that allocation's node to unmount the
 volume. This has implications on how to deploy CSI plugins:
 
-* During node drains, jobs that claim volumes must be moved before the `node`
-  or `monolith` plugin for those volumes. You should run `node` or `monolith`
-  plugins as [`system`][system] jobs and use the `-ignore-system` flag on
-  `nomad node drain` to ensure that the plugins are running while the node is
-  being drained.
+* If you are stopping jobs on a node, you must stop tasks that claim
+  volumes before stopping the `node` or `monolith` plugin for those
+  volumes. If you use the `node drain` feature, plugin tasks will
+  automatically be drained last.
 
-* Only one plugin instance of a given plugin ID and type (controller or node)
-  should be deployed on any given client node. Use a constraint as shown
-  below.
+* Only the most recently-placed allocation for a given plugin ID and
+  type (controller or node) will be used by any given client node. Run
+  `node` plugins as system jobs and distribute `controller` plugins
+  across client nodes using a constraint as shown below.
 
 * Some plugins will create volumes only in the same location as the
-  plugin. For example, the AWS EBS plugin will create and mount volumes only
-  within the same Availability Zone. You should deploy these plugins with a
-  unique-per-AZ `plugin_id` to allow Nomad to place allocations in the correct
-  AZ.
+  plugin. For example, the AWS EBS plugin will create and mount
+  volumes only within the same Availability Zone. You should
+  configure your plugin task as recommended by the plugin's
+  documentation and use the [`topology_request`] field in your
+  volume specification.
 
 ## `csi_plugin` Examples
 
@@ -124,3 +124,4 @@ job "plugin-efs" {
 [csi]: https://github.com/container-storage-interface/spec
 [csi_volumes]: /docs/job-specification/volume
 [system]: /docs/schedulers#system
+[`topology_request`]: /docs/commands/volume/create#topology_request
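+
+As a sketch, a volume specification for `nomad volume create` that
+restricts an AWS EBS volume to one availability zone might use
+`topology_request` like this (the volume ID, plugin ID, and zone are
+illustrative):
+
+```hcl
+id        = "ebs-vol0"
+name      = "ebs-vol0"
+type      = "csi"
+plugin_id = "aws-ebs0"
+
+capability {
+  access_mode     = "single-node-writer"
+  attachment_mode = "file-system"
+}
+
+topology_request {
+  required {
+    # the EBS plugin reports availability zone under this segment key
+    topology {
+      segments {
+        "topology.ebs.csi.aws.com/zone" = "us-east-1a"
+      }
+    }
+  }
+}
+```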