-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hostpath: Add block volume support #6
hostpath: Add block volume support #6
Conversation
I tried this with canary image of csi-provisioner and the
The same was also listed in the pod description:
The publish target path for block devices are at |
pkg/hostpath/controllerserver.go
Outdated
of := fmt.Sprintf("%s=%s", "of", path) | ||
count := fmt.Sprintf("%s=%d", "count", capacity/mib) | ||
// Create a block file. | ||
out, err := executor.Command("dd", "if=/dev/zero", of, "bs=1M", count).CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use fallocate [1] instead of dd? It's way faster.
pkg/hostpath/nodeserver.go
Outdated
|
||
executor := utilexec.New() | ||
// Get a free loop device. | ||
out, err := executor.Command("losetup", "-f").CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two node publish calls can race here and choose the same loop device. It's possible to remove those invocation of losetup and just pass the -f flag to the invocation below, then also pass --show to find out which device it chose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In k/k, makeLoopDevice
is doing the way that @bswartz mentioned. So, it would be worth checking this code.
pkg/hostpath/nodeserver.go
Outdated
losetupArgs = append(losetupArgs, loopDevice, vol.VolPath) | ||
|
||
// Associate block file with the loop device. | ||
out, err = executor.Command("losetup", losetupArgs...).CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not idempotent. If you get killed between the call to "losetup" and "ln", when you restart and come back into this method, you'll create another loop device, leaking the old one.
pkg/hostpath/nodeserver.go
Outdated
} | ||
|
||
// Create symlink to the target path. | ||
out, err = executor.Command("ln", "-s", loopDevice, targetPath).CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need a check higher up in this function to see if the symlink already exists and points to the right place, for idempotency. Otherwise you'll end up in a failure loop when the link already exists because "ln" without "-f" won't overwrite existing symlinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, for idempotency, CSI community recommends to use bind mount of device, instead of symlink. If you use bind mount, your can check if it is already done, by checking if file already exists in targetPath and if the loopDevice is already mounted to targetPath.
pkg/hostpath/nodeserver.go
Outdated
switch vol.VolAccessType { | ||
case blockAccess: | ||
// Remove the symlink. | ||
os.RemoveAll(targetPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on adding block volume support codes to csi hostpath driver.
It's a good idea to make loopback device visible to pod as a block volume for drivers that originally provide storage as filesystem, not block volume.
I'm adding some comments.
pkg/hostpath/nodeserver.go
Outdated
|
||
executor := utilexec.New() | ||
// Get a free loop device. | ||
out, err := executor.Command("losetup", "-f").CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In k/k, makeLoopDevice
is doing the way that @bswartz mentioned. So, it would be worth checking this code.
pkg/hostpath/nodeserver.go
Outdated
} | ||
|
||
// Create symlink to the target path. | ||
out, err = executor.Command("ln", "-s", loopDevice, targetPath).CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, for idempotency, CSI community recommends to use bind mount of device, instead of symlink. If you use bind mount, your can check if it is already done, by checking if file already exists in targetPath and if the loopDevice is already mounted to targetPath.
BTW I understand that the host path driver isn't meant for production and is just for testing, so my comments about idempotency are just suggestions. Given the low likelihood that non-idempotency will cause any actual issues during testing, I wouldn't spend a ton of time on that. Still there are a few sections of code that could be improved easily. See mkimuram's comments. |
1a9ef86
to
083b8fc
Compare
Hi, I've added some more commits addressing the previous reviews. Keeping the old commits for now, in case I need to refer back. Will squash all the commits once the changes are final. I moved the loop device creation from NodePublish to CreateVolume and also moved the corresponding loop device deletion from NodeUnpublish to DeleteVolume. Now NodePublish just checks the symlink targets of a given target path and creates symlink if needed. And NodeUnpublish deletes the symlinks. The volume ID is set to the loop device base name, avoiding the need to maintain any volume info in memory. Regarding the usage of Regarding using bind mount (#6 (comment)), I'm a little confused about that. Trying to bind mount a loop device associated with a block device fails unless the block device has a filesystem. AFAIK, we are trying to provide raw block here. Should we format it and then bind mount? Also, I've added a security context to run the plugin as privileged in the plugin deployment manifest. Without this, loop device creation was failing with file not found error, because it wasn't shared with the host. |
The version of |
Yeah, I just tried doing |
When you bind mount a file, the destination must be an existing file. There's no need for a filesystem. |
083b8fc
to
267cbdc
Compare
This change adds block volume support to hostpath driver. When a block volume request is received, a block file is created at provisionRoot with the requested capacity as size and a loop device is created associated with the block file. At node publish, a bind mount of the loop device is created at the publish target path. At node unpublish, the target path is unmounted and deleted. At volume delete, loop device is disassociated and the block file is deleted. Add plugins-dir to hostpath plugin daemonset The volume publish target path for block devices are usually under /var/lib/kubelet/plugins directory. Hence, adding plugins directory to the pod volumes with bidirectional mount propagation. Run the plugin as privileged to use loop devices In order to share loop devices with the host, the plugin container must be run as a privileged container.
267cbdc
to
ea76c5b
Compare
Block support works with bind mount now :) I made some changes to how the statelessness was implemented. Instead of keeping the loop device name in volume ID, it is now queried through the updated version of losetup ( Added new files in vendor because I've used the functions available in |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: darkowlzz, msau42 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This change adds block volume support to hostpath driver.
When a block volume request is received, a block file is created at
provisionRoot with the requested capacity as size and a loop device is
created associated with the block file.
At node publish, a bind mount of the loop device is created at the
publish target path.
At node unpublish, the target path is unmounted and deleted.
At volume delete, loop device is disassociated and the block file is
deleted.
Add plugins-dir to hostpath plugin daemonset
The volume publish target path for block devices are usually under
/var/lib/kubelet/plugins directory. Hence, adding plugins directory to
the pod volumes with bidirectional mount propagation.
Run the plugin as privileged to use loop devices
In order to share loop devices with the host, the plugin container must
be run as a privileged container.