[CSI driver] Implementation of staged mount for vhost block and FS volumes in VM mode #1982

aikuchin · 2024-09-10T00:04:05Z

Previously, endpoints were started in the NodePublish phase. This breaks hotplug of multiple disks to the same VM: when the helper pod changes, the endpoints are recreated, and the driver ends up in a deadlock attempting to create a new endpoint before the old one has stopped.

This commit splits volume mounting between the NodeStage and NodePublish phases: the endpoint is started in NodeStage, and NodePublish only creates the mount for the requested pod. With this split, a change of hotplug pod doesn't stop the endpoint (because it doesn't trigger NodeUnpublish) but instead results in remounts, avoiding the deadlock and reconnects.

What is unclear:

~~should stage mount be implemented for other types (FS, nbd?)~~
Decided that only vhost endpoints (NBS and NFS) should use the new scheme.
~~how to get podId in NodeStage phase (from volume attributes?)~~
We don't use podId at all; instead we use instanceId from the volume attributes. If it is not provided, the old scheme is used.
~~how to make transfer from old scheme seamless~~
Volumes that have the instanceId attribute will use the new scheme; others will continue using the old scheme.
~~should we use separate stage directory or just use s.socketsDir~~
Volumes are grouped in per-instanceId directories.
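
The selection rule described in the answers above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the attribute key `instanceId` and the helper name `stagedInstanceID` are assumptions made for the example.

```go
package main

import "fmt"

// instanceIDKey is a hypothetical attribute name; the key the driver
// actually reads from the volume attributes may differ.
const instanceIDKey = "instanceId"

// stagedInstanceID (hypothetical helper) reports whether a volume should use
// the staged scheme. It returns the instance ID and true when the attribute
// is present and non-empty, and false when the old scheme should be used.
func stagedInstanceID(volumeContext map[string]string) (string, bool) {
	id := volumeContext[instanceIDKey] // reading from a nil map is safe in Go
	return id, id != ""
}

func main() {
	contexts := []map[string]string{
		{instanceIDKey: "vm-42"}, // new scheme: endpoint starts in NodeStage
		{},                       // old scheme: endpoint starts in NodePublish
	}
	for _, ctx := range contexts {
		if id, ok := stagedInstanceID(ctx); ok {
			fmt.Printf("staged scheme for instance %s\n", id)
		} else {
			fmt.Println("old scheme")
		}
	}
}
```

Keying the decision on a single optional attribute means existing volumes need no migration step: they simply never take the new branch.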

github-actions · 2024-09-10T00:04:15Z

Hi! Thank you for contributing!
The tests on this PR will run after a maintainer adds an ok-to-test label to this PR manually. Thank you for your patience!

github-actions · 2024-09-23T14:03:26Z

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit 6381d5d.

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED |
| ---: | ---: | ---: | ---: | ---: | ---: |
| 6168 | 6168 | 0 | 0 | 0 | 0 |

Previously we assumed that the pod ID is always present and can be used as part of the mount path, but if the endpoint is started in NodeStage, podId is not known. For such cases, return the name of the staging directory where the endpoint will be started. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

This keeps code paths that require podId separate from those that rely on instanceId. It also lets us remove the temporary hack from getInstanceOrPodId and rename it back to getPodId. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

Previously all staged disks on the node lived in a shared staging directory, but now the stage record holds all the information needed to get instanceId in NodeUnstage, so we can group disks in a per-instance directory.

Old path looked like: /...NBS.../sockets/${POD_ID}/${VOLUME_ID}/[nbs,nfs].sock
Intermediate path looked like: /...NBS.../sockets/staging/${VOLUME_ID}/[nbs,nfs].sock
New path will look like: /...NBS.../sockets/${INSTANCE_ID}/${VOLUME_ID}/[nbs,nfs].sock

Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

According to the spec, the target directory must be removed after NodeUnstage, but the mounter mock currently doesn't delete it during mount point cleanup. If directory deletion is enabled, another test breaks because it relies on previous tests leaving temporary directories behind. This will be fixed separately from this PR. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

We use the same code for all volume types in NodeUnpublish because we have no good way to determine their type. Add a comment explaining why the StopEndpoint calls in NodeUnpublishVolume have no effect for staged disks. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

github-actions · 2024-09-27T11:38:34Z

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo: all tests PASSED for commit e7c8ad8.

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED |
| ---: | ---: | ---: | ---: | ---: | ---: |
| 3463 | 3463 | 0 | 0 | 0 | 0 |

aikuchin added 3 commits September 23, 2024 05:38

cleanup: fix error comparison

386ab7e

Direct comparison can fail for wrapped errors, use errors.Is(err, target) instead. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

cleanup: Removed unused function arguments

8d3b571

This way it is more obvious that they are unused and makes the linter happy. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

cleanup: Fix potential nil pointer dereference

f351ec0

This is an impossible code path now but someday this can change and otherwise function is ready to have nil mnt, so fix one more condition. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

aikuchin force-pushed the users/antonkuchin/csi-stage-volumes branch 2 times, most recently from 2271f27 to 6381d5d Compare September 23, 2024 05:09

aikuchin changed the title ~~[RFC] [CSI driver] PoC of staged mount for vhost block volumes in VM mode~~ [CSI driver] PoC of staged mount for vhost block volumes in VM mode Sep 23, 2024

aikuchin marked this pull request as ready for review September 23, 2024 05:22

qkrorlqr added the ok-to-test and large-tests labels Sep 23, 2024

github-actions bot removed the ok-to-test label Sep 23, 2024

antonmyagkov self-requested a review September 23, 2024 10:58

aikuchin changed the title ~~[CSI driver] PoC of staged mount for vhost block volumes in VM mode~~ [CSI driver] Imlementation of staged mount for vhost block volumes in VM mode Sep 23, 2024

aikuchin changed the title ~~[CSI driver] Imlementation of staged mount for vhost block volumes in VM mode~~ [CSI driver] Implementation of staged mount for vhost block volumes in VM mode Sep 23, 2024

aikuchin changed the title ~~[CSI driver] Implementation of staged mount for vhost block volumes in VM mode~~ [CSI driver] Implementation of staged mount for vhost block and FS volumes in VM mode Sep 23, 2024

aikuchin added 7 commits September 25, 2024 18:57

cleanup: Use import alias to avoid name collision with variable

4b0d3ec

Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

cleanup: Explicitly ignore errors that doesn't matter

f6e5fb8

Some errors are expected or don't matter, so ignore them explicitly. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

cleanup: Extract disk.img file creation in separate function

8c4ab66

Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

CSI staged mount: Separate start endpoint for vhost and NBD

76aa3b2

This will be required later to add staged mount for vhost endpoints Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

csi tests: extract values from test calls to variables

f06fa6b

Volume context and capabilities should be reused in Stage and Publish. Move most of the other constants from arguments to variables. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

staged csi: move dummy img file craation from mount to callers

6f3878a

This will be used later when part of NodePublish logic for vhost endpoints will be moved to NodeStage. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

aikuchin force-pushed the users/antonkuchin/csi-stage-volumes branch from 6381d5d to aa98a8f Compare September 25, 2024 17:53

antonmyagkov mentioned this pull request Sep 25, 2024

test csi driver in vm mode #2132

Merged

antonmyagkov reviewed Sep 26, 2024

cloud/blockstore/tools/csi_driver/internal/driver/node.go (resolved)

antonmyagkov reviewed Sep 26, 2024

cloud/blockstore/tools/csi_driver/internal/driver/node.go (resolved)

cloud/blockstore/tools/csi_driver/internal/driver/node.go (outdated, resolved)

tpashkin reviewed Sep 26, 2024

cloud/blockstore/tools/csi_driver/internal/driver/node.go (outdated, resolved)

cloud/blockstore/tools/csi_driver/internal/driver/node.go (outdated, resolved)

antonmyagkov reviewed Sep 27, 2024

cloud/blockstore/tools/csi_driver/internal/driver/node_test.go (outdated, resolved)

staged csi: split endpoint start and mount for volumes that have inst…

d93baa5

…anceId attribute Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

aikuchin added 6 commits September 27, 2024 11:10

staged csi: extract staging path construction to function

6feb1e1

Also rename all socketDir variables to endpointDir for consistency. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

staged csi: remove endpointDir construction from mount function

e9c5faf

In most callsites we already know endpointDir, so just accept it as an argument instead of rebuilding every time. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

staged csi: make NodeStageVolume logic a little more explicit

34e63f1

This way instanceId variable can be used only if it is not empty and error handling happens only if any of stage functions was called. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

staged csi: save volume parameters as json file in stagingTargetPath

f134a8a

We will need these parameters later to properly stop endpoints during NodeUnstage. This also allows to skip unnecessary calls in unstage of old volumes. Signed-off-by: Anton Kuchin <antonkuchin@nebius.com>

aikuchin force-pushed the users/antonkuchin/csi-stage-volumes branch from aa98a8f to fc02929 Compare September 27, 2024 09:16

aikuchin added 2 commits September 27, 2024 11:25

aikuchin force-pushed the users/antonkuchin/csi-stage-volumes branch from fc02929 to e7c8ad8 Compare September 27, 2024 09:26

antonmyagkov self-requested a review September 27, 2024 09:33

antonmyagkov approved these changes Sep 27, 2024

tpashkin added the blockstore and ok-to-test labels Sep 27, 2024

github-actions bot removed the ok-to-test label Sep 27, 2024

antonmyagkov requested a review from drbasic September 27, 2024 14:08

drbasic approved these changes Sep 30, 2024

qkrorlqr merged commit f4ac3a9 into ydb-platform:main Sep 30, 2024
24 of 28 checks passed

aikuchin deleted the users/antonkuchin/csi-stage-volumes branch September 30, 2024 11:02

antonmyagkov mentioned this pull request Oct 11, 2024

Implement (un)stage/(un)publish volume according csi spec for mount mode #2195

Merged
