Skip to content

Add volume capability to volume expansion calls #381

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

gnufied
Copy link
Contributor

@gnufied gnufied commented Aug 14, 2019

Add volume capability to volume expansion calls.

cc @msau42 @wongma7 @saad-ali @davidz627

Fixes #380

@jdef
Copy link
Member

jdef commented Aug 15, 2019

It's possible that a confused CO might send along capabilities beyond the ability of the volume (and/or plugin) to support. What error code should a CO expect in such circumstances? InvalidArgument? FailedPrecondition? Aborted? How does a CO recover from such?

@gnufied
Copy link
Contributor Author

gnufied commented Aug 15, 2019

I think that - if CO sends along capabilities which are not supported by the volume or plugin, then we should return FailedPreCondition. If that happens CO should verify the capabilities by calling ValidateVolumeCapabilities and retry with right capabilities.

I will update the spec to reflect that.

csi.proto Outdated
// filesystem. If volume_capability is omitted the SP MAY determine
// access_type from given volume_path for the volume and perform
// node expansion. This is an OPTIONAL field.
VolumeCapability volume_capability = 4;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for If volume_capability is omitted the SP MAY determine access_type from given volume_path for the volume and perform node expansion, may be add some more details about how does SP determine this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking of adding some examples that SP MAY query "volume_path" and determine if given volume_path is a block device or mounted filesystem. But being too specific here requires us bring in operating system and file system specific calls (Stat), which may not be suitable for higher level specs.

// device or mounted file system. For example - if volume is being
// used as a block device the SP MAY choose to skip expanding the
// filesystem. If volume_capability is omitted the SP MAY determine
// access_type from given volume_path for the volume and perform
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If volume_capability is omitted the SP MAY determine
// access_type from given volume_path for the volume and perform
// node expansion.

I know this is MAY but it feels a little bit prescriptive. Do we want to give this suggestion if this is not something a SP "should" be doing? I feel like it's a specific workaround that may not always be possible to do from the volume_path

Copy link
Contributor Author

@gnufied gnufied Aug 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give me some examples where it may not be possible to determine access_type from given volume_path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to give this suggestion if this is not something a SP "should" be doing?

I think the only way to convey that is to convert this field from OPTIONAL to REQUIRED and MAY to MUST.

But then we will have to introduce a new node capability for being backward compatible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give me some examples where it may not be possible to determine access_type from given volume_path

I don't have any :) Ack on the backward compatibility, think thats ok

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overlaps with Ben's NodeExpand for multi-attach -- should coordinate that work with this.

// NodeExpandVolume on the node by setting node_expansion_required
// to false in ControllerExpandVolumeResponse.
// This is an OPTIONAL field.
VolumeCapability volume_capability = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A k8s PV may have multiple access modes. How do we map that to a single VolumeCapability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was discussed in CSI meeting. Resizing is only dependent on access_type and not on access_mode when it comes to whether volume can be resized(how) and hence just having singular capability should be enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we say that in the spec here. That only the access_type will be set not access_mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way VolumeCapability is defined access_type and access_mode both will be always set. It is another thing that expansion calls don't care about access_mode but there is just no way to instantiate VolumeCapability and not have access_type set on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional note.

@gnufied gnufied force-pushed the add-volume-capability-to-expansion-calls branch from d2a85cb to 96dc029 Compare October 10, 2019 20:32
@gnufied
Copy link
Contributor Author

gnufied commented Oct 10, 2019

@jdef updated the spec to return error code when mis-matching volume capabilities are specified.

// Volume capability describing how the CO intends to use this volume.
// This allows SP to determine if volume is being used as a block
// device or mounted file system. For example - if volume is being
// used as a block device the SP MAY choose to skip expanding the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like whether the NodeExpand can do work is psuedo-determined in two places

  1. here, based on the volume capability
  2. setting node_expansion_required on the response to ControllerExpand

Can you explain why you might want to set one or not the other, or when they would differ? Basically: why do we need both, can we simplify the spec to use only one of them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to NodeExpand is still made if node_expansion_required is set to true. It is upto SP to process a block device node expansion and determine how expansion should work. For example:

  1. In case of EBS or GCE-PD raw block devices, NodeExpand call might be a NO-OP and as such may never be called.
  2. For something like iSCSI raw block devices though, even if there is no file system on the device it may still require rescan of the LUN, so as OS can see updated size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is the SP that processes ControllerExpand and has Block/FS information at that level and sets the node_expansion_required. In case 1) couldn't those drivers set node_expansion_required=false and in case 2) those drivers set node_expansion_required=true. When node_expansion_required=false the NodeExpand should never get called right?

Then the NodeExpand doesn't really need the capability to decide anything anymore. I feel like maybe I'm misunderstanding something here. Could you explain why node_expansion_required is not enough of a signal for doing a NodeExpand? Is there the possibility of node_expansion_required=true and there will be different behavior for either Block or FS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there the possibility of node_expansion_required=true and there will be different behavior for either Block or FS

Yes there is! For example - an iSCSI implementation may return node_expansion_required as true in ControllerExpandVolume call for raw block devices but actual implementation of NodeExpandVolume call for iSCSI might be different between whether device is raw block or FS. For example - if device is raw block, it could just do a rescan of the LUN. But device has FS on it, it would rescan the LUN and also expand the filesystem.

We must also consider case of volume types that don't necessarily implement a control-plane volume expansion (such as ceph-csi). In that case, actual implementation of NodeExpandVolume might be different based on block or FS.

Copy link
Contributor

@davidz627 davidz627 Oct 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see, this answers my question. Thanks! LGTM

@saad-ali
Copy link
Member

This will need a rebase

@gnufied gnufied force-pushed the add-volume-capability-to-expansion-calls branch from 3241315 to e4b23dc Compare October 16, 2019 20:06
@gnufied
Copy link
Contributor Author

gnufied commented Oct 16, 2019

@saad-ali rebased.

@gnufied gnufied force-pushed the add-volume-capability-to-expansion-calls branch from e4b23dc to 683b6cc Compare October 16, 2019 20:15
@saad-ali
Copy link
Member

saad-ali commented Oct 18, 2019

/lgtm
/approve

@jdef or @julian-hj or @ddebroy or @jieyu can you PTAL

Copy link
Contributor

@ddebroy ddebroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor suggestions around comments. Otherwise lgtm.

Is there a way to capture #381 (comment) is some form? It's pretty subtle but important.

csi.proto Outdated
// Volume capability describing how the CO intends to use this volume.
// This allows SP to determine if volume is being used as a block
// device or mounted file system. For example - if volume is
// being used as a block device - the SP MAY choose to skip calling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor clarification: since it's the CO that calls NodeExpandVolume, will it be clearer if the SP MAY choose to skip calling NodeExpandVolume on the node by setting node_expansion_required to false in ControllerExpandVolumeResponse. is reworded to something along the lines of the SP MAY set node_expansion_required to false in ControllerExpandVolumeResponse to skip invocation of NodeExpandVolume on the node by CO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

spec.md Outdated
@@ -1805,6 +1805,8 @@ This RPC allows the CO to expand the size of a volume.
This call MAY be made by the CO during any time in the lifecycle of the volume after creation if plugin has `VolumeExpansion.ONLINE` capability.
If plugin has `EXPAND_VOLUME` node capability, then `NodeExpandVolume` MUST be called after successful `ControllerExpandVolume` and `node_expansion_required` in `ControllerExpandVolumeResponse` is `true`.

If specified the `volume_capability` in `ControllerExpandVolumeRequest` should be same as what CO would pass in `ControllerPublishVolumeRequest`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If specified, the ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@gnufied
Copy link
Contributor Author

gnufied commented Oct 18, 2019

@ddebroy I have reworded the comment for NodeExpandVolume to provide more context and capture #381 (comment) PTAL

@ddebroy
Copy link
Contributor

ddebroy commented Oct 18, 2019

Thanks!
/lgtm

Update specs and add new error code for invalid capabilities
@gnufied gnufied force-pushed the add-volume-capability-to-expansion-calls branch from ab5b53a to 6c94335 Compare October 18, 2019 18:39
@saad-ali
Copy link
Member

Merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add volume capability for controller and node expansion requests
7 participants