Insufficient data on some RPCs #216
Closing the case and adding the resolution of the issue for reference.

This issue was discussed during the CSI meeting, and a more detailed explanation was then provided by @lpabon and @j-griffith, which highlighted the conceptual differences between the CSI approach and other approaches, like the contract between the Cinder service and its drivers.

In Cinder, the volume service provides a persistent metadata storage solution to all backend drivers, and drivers can rely on it to store any information they deem necessary or convenient on the method calls that support this, always within the Cinder-driver contract restrictions. That driver-specific information is then passed on all future calls for the resource (volume, snapshot, or backup).

In CSI, on the other hand, the existing metadata flows for volume attributes and publish_info are more of a convenience mechanism than a persistence mechanism meant to store a driver's metadata, since drivers are expected to either reconstruct this data by querying the backend or be configured to use external persistent storage (etcd, mariadb, etc.).

So, given that CSI has no intention at this point of providing a reliable persistent storage solution to cover all the metadata storage needs of CSI drivers, the solution falls on the CSI drivers themselves, be it by deploying a key-value storage solution or by having different storage plugins that use specific features of the CO (for example, when deployed in Kubernetes they could use CRDs to store the data); a rough sketch of such a driver-side store follows below.

Thanks everyone for taking the time to understand the Cinder approach and explain the differences with the CSI approach.

Cheers,
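A minimal sketch of what that driver-side persistence could look like, assuming a local SQLite file for simplicity (the `MetadataStore` class and its schema are made up; etcd, MariaDB, or Kubernetes CRDs could back the same interface):

```python
import json
import sqlite3


class MetadataStore:
    """Hypothetical per-resource metadata store owned by the CSI driver."""

    def __init__(self, path="metadata.db"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS resource_meta "
            "(resource_id TEXT PRIMARY KEY, metadata TEXT)"
        )

    def save(self, resource_id, metadata):
        # Called wherever the driver produces metadata (e.g. CreateVolume).
        self._db.execute(
            "INSERT OR REPLACE INTO resource_meta VALUES (?, ?)",
            (resource_id, json.dumps(metadata)),
        )
        self._db.commit()

    def load(self, resource_id):
        # Called on RPCs where the spec does not pass the data back
        # (e.g. DeleteVolume, NodeUnstageVolume).
        row = self._db.execute(
            "SELECT metadata FROM resource_meta WHERE resource_id = ?",
            (resource_id,),
        ).fetchone()
        return json.loads(row[0]) if row else None
```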
Hi,
I work on the Cinder OpenStack project, and I've started looking into writing a CO-agnostic Python CSI driver that uses the Cinder storage drivers directly instead of a full Cinder deployment (standalone or part of an OpenStack deployment).
So the main difference with all the other Cinder CSI implementations is that this one would not require running RabbitMQ or the 3 Cinder services (API, Scheduler, and Volume), and ideally not even a DBMS; it would import the Cinder driver Python code directly.
Using the Cinder Python driver code directly means the drivers will behave the same way they do within the Cinder service: the resource ID is generated outside of the driver, and there is a similar mechanism for drivers to return metadata on certain calls (like CSI's volume attributes and publish_info). That information is then passed to all future driver calls related to the resource, and each driver may return different data, or none at all, on those calls.
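For anyone unfamiliar with that contract, a rough sketch (the `ExampleDriver` and its backend calls are made up; `provider_location` is one of the real Cinder volume fields drivers use for this):

```python
class ExampleDriver(object):
    """Hypothetical backend driver following the Cinder contract."""

    def _backend_create_lun(self, size_gb, name):
        # Placeholder for a real storage backend API call.
        return {'iqn': 'iqn.2004-04.com.example:%s' % name}

    def _backend_delete_lun(self, iqn):
        # Placeholder for a real storage backend API call.
        pass

    def create_volume(self, volume):
        # volume['id'] was generated by Cinder, outside the driver.
        lun = self._backend_create_lun(volume['size'], volume['id'])
        # Whatever is returned here is persisted by the volume service...
        return {'provider_location': '10.0.0.1:3260,1 %s' % lun['iqn']}

    def delete_volume(self, volume):
        # ...and handed back on every future call for this volume.
        iqn = volume['provider_location'].split()[1]
        self._backend_delete_lun(iqn)
```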
For that purpose I was considering leveraging the existing mechanisms described in the CSI spec to store metadata, like the volume attributes that can be returned by CreateVolume and the publish_info returned by ControllerPublishVolume.
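Something like this schematic handler, using plain dicts instead of the generated gRPC classes (`FakeBackend` and the values are made up; `attributes` is the CSI field):

```python
import uuid


class FakeBackend:
    """Stand-in for a real storage backend (illustrative only)."""

    def create(self, name, size_bytes):
        return {'id': str(uuid.uuid4()), 'size': size_bytes,
                'pool': 'pool-a', 'backend': 'array-1'}


backend = FakeBackend()


def CreateVolume(request):
    vol = backend.create(request['name'], request['capacity_bytes'])
    return {'volume': {
        'id': vol['id'],
        'capacity_bytes': vol['size'],
        # Driver metadata that the CO hands back on ControllerPublishVolume,
        # but, per this issue, not on DeleteVolume.
        'attributes': {'pool': vol['pool'], 'backend': vol['backend']},
    }}
```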
Unfortunately the current CSI spec is too limited in how it passes the volume attributes and the publish_info around. I'll go over each of the limitations I've found, providing concrete examples:
**Missing volume attributes on DeleteVolume**: In some cases the id alone does not provide enough information to uniquely identify a volume in the backend, which is why the volume attributes are passed to ControllerPublishVolume. But the same information is needed to delete the volume; without it we cannot even locate it.
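Continuing the schematic example above, the delete side shows the problem (the `Pool` class and the scan are made up for illustration):

```python
class Pool:
    """Stand-in for one storage pool behind the driver (illustrative)."""

    def __init__(self, name):
        self.name, self.volumes = name, {}

    def has_volume(self, volume_id):
        return volume_id in self.volumes

    def delete(self, volume_id):
        del self.volumes[volume_id]


all_pools = [Pool('pool-a'), Pool('pool-b')]


def DeleteVolume(request):
    volume_id = request['volume_id']
    # With several pools behind one driver, the id alone does not say where
    # the volume lives. The 'pool' attribute returned by CreateVolume would
    # answer that directly, but since the spec does not pass attributes here
    # the driver has to scan every pool (or keep its own database).
    for pool in all_pools:
        if pool.has_volume(volume_id):
            pool.delete(volume_id)
            break
    return {}  # DeleteVolume is idempotent, so a missing volume is still OK
```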
**publish_info missing on ControllerUnpublishVolume**: On ControllerPublishVolume this CSI driver would return publish_info that ControllerUnpublishVolume then needs in order to undo the export, yet the spec does not pass it there.
**publish_info missing on NodeUnstageVolume and NodeUnpublishVolume**: When detaching a volume from a node we need to know how it was attached, so we can detach it efficiently and without leaving leftovers (stale devices, open sessions).
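For example, for an iSCSI attachment the node needs to log out of exactly the session that was set up for the volume, which requires data that only exists in the publish_info (the keys below are illustrative):

```python
import subprocess


def NodeUnstageVolume(request, publish_info):
    """Sketch of an iSCSI detach; today's spec does not deliver publish_info
    here, so target_iqn/target_portal would have to be rediscovered or
    persisted by the driver itself."""
    # Log out of the exact session ControllerPublishVolume set up, instead
    # of guessing among all the sessions present on the node.
    subprocess.check_call([
        'iscsiadm', '-m', 'node',
        '-T', publish_info['target_iqn'],
        '-p', publish_info['target_portal'],
        '--logout',
    ])
    # Remove the node record as well, so no stale entries are left behind.
    subprocess.check_call([
        'iscsiadm', '-m', 'node',
        '-T', publish_info['target_iqn'],
        '-p', publish_info['target_portal'],
        '-o', 'delete',
    ])
    return {}
```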
**NodeGetId can't return node attributes/facts**: According to the CSI spec, the only way a node can identify itself to the controller is the node_id string returned by NodeGetId. This is too restrictive: it is a single string with a maximum length of only 128, which is insufficient for a backend that can do iSCSI, FC, and NFS, where the transport actually used depends on how the controller side of the CSI driver is configured or on the enabled storage array options. The node should be able to provide all its facts (initiator name, WWPNs, WWNNs, IP, etc.) so the controller can determine which connection mechanism to use and set the right ACLs when exposing the LUN.
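Gathering just the basic connection facts on a node already shows how quickly that limit is exceeded (a sketch; the sysfs and iSCSI paths are the standard Linux locations, error handling and IP discovery are omitted):

```python
import glob
import json


def gather_node_facts():
    """Collect the connection facts the controller side would need."""
    facts = {'ip': '192.0.2.10'}  # illustrative; would be discovered
    # iSCSI initiator name.
    with open('/etc/iscsi/initiatorname.iscsi') as f:
        for line in f:
            if line.startswith('InitiatorName='):
                facts['initiator'] = line.split('=', 1)[1].strip()
    # Fibre Channel WWPNs and WWNNs.
    facts['wwpns'] = [open(p).read().strip()
                      for p in glob.glob('/sys/class/fc_host/*/port_name')]
    facts['wwnns'] = [open(p).read().strip()
                      for p in glob.glob('/sys/class/fc_host/*/node_name')]
    return facts


# A single iSCSI IQN is already ~40 characters; serializing all the facts
# can easily blow past a 128-character node_id.
print(len(json.dumps(gather_node_facts())))
```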
In summary, would it be possible to modify the CSI spec to support the following?

- Passing the volume attributes to DeleteVolume (and any other RPC that needs to locate the volume on the backend).
- Passing the publish_info to ControllerUnpublishVolume, NodeUnstageVolume, and NodeUnpublishVolume.
- Letting nodes report structured attributes/facts to the controller instead of a single size-limited node_id string.
Thanks in advance for your consideration,
Gorka.