Storage: Add support for online disk growing of zfs and lvm block volumes (from Incus) #14211

Contributor

@kadinsayani kadinsayani commented Oct 4, 2024

This PR adds support for resizing (growing) VM disks without rebooting, when using ZFS or LVM storage backends.

Resolves #13311.

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from c32cf54 to 09b040b Compare October 7, 2024 17:25
@kadinsayani kadinsayani marked this pull request as ready for review October 7, 2024 18:20
@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 5141575 to 4646696 Compare October 7, 2024 20:56
@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 4646696 to 39f4b7a Compare October 7, 2024 21:03
@github-actions github-actions bot added Documentation Documentation needs updating API Changes to the REST API labels Oct 7, 2024

github-actions bot commented Oct 7, 2024

Heads up @mionaalex - the "Documentation" label was applied to this issue.

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch 2 times, most recently from bef2c1b to eb1b3a6 Compare October 8, 2024 02:03
@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from eb1b3a6 to 41533d9 Compare October 8, 2024 02:19
@simondeziel
Member

What would prevent growing live the .raw file backing a QEMU on another storage driver? Or maybe that was left for another day/PR?

@tomponline
Member

> What would prevent growing live the .raw file backing a QEMU on another storage driver? Or maybe that was left for another day/PR?

I'd like it if we could explore adding support for that. We support growing the raw disk file offline, so I'm not sure if there is a reason we can't do it online?

@tomponline
Member

Needs a rebase too please

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 41533d9 to 877ed7d Compare October 9, 2024 16:16
@kadinsayani
Contributor Author

@simondeziel @tomponline re: online disk resize

I don't see an issue with adding online disk resizing for ceph. RBD has an exclusive lock feature and supports online resizing with RBD client kernel > 3.10.

@simondeziel
Member

Thanks for checking on ceph RBD live resize capabilities! As Tom mentioned, we can already grow plain .raw file while offline so maybe we could do that live too now that there's a mechanism to notify QEMU about the bigger backing file.
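As an illustration of the notification step mentioned above: QEMU's QMP interface has a `block_resize` command that takes a block node (or device) and a new size in bytes. The thread does not show the exact call this PR makes, so the following is only a sketch of what such a request could look like. The helper name `blockResizeCommand` and the node name `lxd_root` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// qmpCommand is a generic QMP request envelope.
type qmpCommand struct {
	Execute   string         `json:"execute"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// blockResizeCommand builds a QMP "block_resize" request that tells a running
// QEMU process the new size, in bytes, of a block node.
// nodeName is an assumption for illustration; the real value depends on how
// the disk was attached to the VM.
func blockResizeCommand(nodeName string, sizeBytes int64) ([]byte, error) {
	if sizeBytes <= 0 {
		return nil, fmt.Errorf("invalid size %d", sizeBytes)
	}

	cmd := qmpCommand{
		Execute: "block_resize",
		Arguments: map[string]any{
			"node-name": nodeName,
			"size":      sizeBytes,
		},
	}

	return json.Marshal(cmd)
}

func main() {
	// 11GiB, matching the size used later in this thread.
	payload, err := blockResizeCommand("lxd_root", 11<<30)
	if err != nil {
		panic(err)
	}

	fmt.Println(string(payload))
}
```

The backing volume (zvol, logical volume, or raw file) would have to be grown on the host first; the QMP request only makes the running guest see the new size.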

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 877ed7d to b3f97a5 Compare October 14, 2024 21:50
@kadinsayani
Contributor Author

@tomponline rebased and good to go. Do we want to include support for live resizing ceph disks with this PR or open up a separate issue and save it for later?

@tomponline
Member

> @tomponline rebased and good to go. Do we want to include support for live resizing ceph disks with this PR or open up a separate issue and save it for later?

Let's try and do it as part of this PR. And then we can add a single API extension.

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch 2 times, most recently from 257be7c to 86b38f5 Compare October 22, 2024 21:24
@kadinsayani
Contributor Author

kadinsayani commented Nov 6, 2024

I've tested live resizing a Ceph RBD filesystem disk and it works as expected - it's just online resizing of Ceph RBD block volumes that doesn't work, which explains why I haven't been able to resize a Ceph backed rootfs.

@kadinsayani
Contributor Author

kadinsayani commented Nov 6, 2024

It doesn't look like we'll be able to add support for online growing of Ceph RBD root disks. Ceph-backed VMs have a read-only snapshot that can't be updated when the root disk size is changed (see below). That snapshot is used for instance creation.

// Block image volumes cannot be resized because they have a readonly snapshot that doesn't get
// updated when the volume's size is changed, and this is what instances are created from.
// During initial volume fill allowUnsafeResize is enabled because snapshot hasn't been taken yet.
if !allowUnsafeResize && vol.volType == VolumeTypeImage {
	return ErrNotSupported
}

Furthermore, online resizing for Ceph volumes is generally considered unsafe in LXD:

allowUnsafeResize := false
if vol.volType == VolumeTypeImage {
	// Allow filler to resize initial image volume as needed.
	// Some storage drivers don't normally allow image volumes to be resized due to
	// them having read-only snapshots that cannot be resized. However when creating
	// the initial image volume and filling it before the snapshot is taken resizing
	// can be allowed and is required in order to support unpacking images larger than
	// the default volume size. The filler function is still expected to obey any
	// volume size restrictions configured on the pool.
	// Unsafe resize is also needed to disable filesystem resize safety checks.
	// This is safe because if for some reason an error occurs the volume will be
	// discarded rather than leaving a corrupt filesystem.
	allowUnsafeResize = true
}
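
To make the interaction between the two excerpts above easier to follow, here is a self-contained sketch of the same guard. It is not the LXD implementation: the type `volumeType`, its constants, and the function `checkResizeAllowed` are hypothetical stand-ins, and only the condition itself is taken from the quoted code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotSupported reuses the sentinel error name from the excerpt above.
var ErrNotSupported = errors.New("not supported")

// volumeType is a hypothetical stand-in for the driver's volume type.
type volumeType string

const (
	volumeTypeImage volumeType = "images"
	volumeTypeVM    volumeType = "virtual-machines"
)

// checkResizeAllowed applies the same condition as the quoted guard: an image
// volume may only be resized while allowUnsafeResize is set, which the second
// excerpt enables during the initial fill, before the read-only snapshot exists.
func checkResizeAllowed(volType volumeType, allowUnsafeResize bool) error {
	if !allowUnsafeResize && volType == volumeTypeImage {
		return ErrNotSupported
	}

	return nil
}

func main() {
	// Image volume after its snapshot has been taken: rejected.
	fmt.Println(checkResizeAllowed(volumeTypeImage, false))

	// Image volume during the initial fill: allowed.
	fmt.Println(checkResizeAllowed(volumeTypeImage, true))

	// Instance volume: this guard does not apply.
	fmt.Println(checkResizeAllowed(volumeTypeVM, false))
}
```

Note that the guard only concerns image volumes. It does not by itself say anything about instance volumes cloned from the image snapshot, which is the case discussed in the rest of this thread.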

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 86b38f5 to 512f8d6 Compare November 6, 2024 22:35
@kadinsayani
Contributor Author

kadinsayani commented Nov 6, 2024

Rebased and good to go.

In summary, we're adding support for online resizing (growing) of any zfs or lvm disks. Online resizing of Ceph RBD filesystem volumes was already possible before the changes in this PR, but we've confirmed that online resizing of Ceph RBD block volumes is not possible due to the read-only snapshot used during instance creation.
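
The state of support described in this summary can be written down as a small lookup table. This is only a sketch of what the comment reports at this point in the thread, not code from the PR; the names `contentType`, `onlineGrow`, and `canGrowOnline` are hypothetical.

```go
package main

import "fmt"

// contentType distinguishes block volumes from filesystem volumes.
type contentType string

const (
	contentBlock      contentType = "block"
	contentFilesystem contentType = "filesystem"
)

// key identifies a storage driver and volume content type pair.
type key struct {
	driver  string
	content contentType
}

// onlineGrow records only what the summary comment states.
// Combinations that are missing from the map are not covered by that comment.
var onlineGrow = map[key]bool{
	{"zfs", contentBlock}:       true,  // added by this PR
	{"lvm", contentBlock}:       true,  // added by this PR
	{"ceph", contentFilesystem}: true,  // reported as already working
	{"ceph", contentBlock}:      false, // reported as not working at this point
}

// canGrowOnline reports whether online growing is supported, and whether the
// combination is covered by the summary at all.
func canGrowOnline(driver string, content contentType) (supported bool, known bool) {
	supported, known = onlineGrow[key{driver, content}]
	return supported, known
}

func main() {
	for _, k := range []key{
		{"zfs", contentBlock},
		{"ceph", contentBlock},
		{"dir", contentBlock},
	} {
		supported, known := canGrowOnline(k.driver, k.content)
		fmt.Printf("%s/%s: supported=%v known=%v\n", k.driver, k.content, supported, known)
	}
}
```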

lxd/api_internal.go (review thread, outdated and resolved)
lxd/storage/drivers/driver_lvm_utils.go (review thread, outdated and resolved)
@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 50c1dcd to e5e9a6e Compare November 7, 2024 14:58
@kadinsayani
Contributor Author

> zvols have a similar read-only snapshot as their origin, I guess it's an inherent limitation of how CoW is implemented in Ceph. Thanks for digging into it.
>
> https://docs.ceph.com/en/reef/rbd/rbd-snapshot/#layering seems to suggest it should just work:
>
> > A copy-on-write clone of a snapshot behaves exactly like any other Ceph block device image. You can read to, write from, clone, and resize cloned images. There are no special restrictions with cloned images.
>
> But since you ran into issues, maybe we need to flatten those cloned images before growing them? https://docs.ceph.com/en/reef/rbd/rbd-snapshot/#flattening-a-cloned-image

Thanks for digging into this further :)

Given my initial research, your new findings, and what I've seen in the LXD codebase, I believe it is theoretically possible to online resize (grow) Ceph RBD block volumes, dir and .raw files.

I think I have some more work to do for this PR.

@kadinsayani
Contributor Author

> But since you ran into issues, maybe we need to flatten those cloned images before growing them? https://docs.ceph.com/en/reef/rbd/rbd-snapshot/#flattening-a-cloned-image

I don't think flattening the cloned image is a safe approach. From the docs:

> Since a flattened image contains all the data stored in the snapshot, a flattened image takes up more storage space than a layered clone does.

@kadinsayani
Contributor Author

So although it is possible to online grow a Ceph RBD backed root disk, I found another problem:

When we create a Ceph RBD volume, a read only snapshot is created. This read only snapshot is used as the clone source for future non-image volumes. The read only or protected property of the snapshot is a precondition for creating RBD clones.

@simondeziel
Member

> When we create a Ceph RBD volume, a read only snapshot is created. This read only snapshot is used as the clone source for future non-image volumes. The read only or protected property of the snapshot is a precondition for creating RBD clones.

That initial image turned into a cloned read-only snapshot really maps to my understanding of how it works with ZFS. It's still not clear why/what's different with Ceph RBD volumes :/

@kadinsayani
Contributor Author

For reference, here is the error I'm getting after modifying the behaviour to allow for online growing the root disk, and adding a file system resize:

root@testbox:~# lxc config device set v1 root size=11GiB
Error: Failed to update device "root": Could not grow underlying "ext4" filesystem for "/dev/rbd0": Failed to run: resize2fs /dev/rbd0: exit status 1 (resize2fs 1.47.0 (5-Feb-2023)
resize2fs: Bad magic number in super-block while trying to open /dev/rbd0)

@simondeziel
Member

> For reference, here is the error I'm getting after modifying the behaviour to allow for online growing the root disk, and adding a file system resize:
>
> root@testbox:~# lxc config device set v1 root size=11GiB
> Error: Failed to update device "root": Could not grow underlying "ext4" filesystem for "/dev/rbd0": Failed to run: resize2fs /dev/rbd0: exit status 1 (resize2fs 1.47.0 (5-Feb-2023)
> resize2fs: Bad magic number in super-block while trying to open /dev/rbd0)

The underlying "ext4" wording seems misleading, as this appears to be code running on the host itself and operating on /dev/rbd0. Also, why would it do that? I'd expect only the VM's /dev/sda to be bigger, no partition touched, no FS resized.

Same for /dev/rbd0, shouldn't it just be bigger?
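
One plausible reading of the error, not confirmed in this thread, is that a VM's block volume holds a whole disk image (partition table and partitions) rather than a bare ext4 filesystem, so there is no ext superblock at the start of /dev/rbd0 for resize2fs to find. The ext2/3/4 superblock starts at byte 1024 of a filesystem and stores the little-endian magic 0xEF53 at offset 0x38 within it. The sketch below checks for that magic in a buffer; `hasExtMagic` is a hypothetical helper and the two buffers are synthetic stand-ins for a bare filesystem and a partitioned disk.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	extSuperblockOffset = 1024                       // superblock start within the filesystem
	extMagicOffset      = extSuperblockOffset + 0x38 // position of the magic field
	extMagic            = 0xEF53                     // ext2/3/4 magic, stored little-endian
)

// hasExtMagic reports whether header, the first bytes of a block device or
// file, carries the ext2/3/4 superblock magic at the expected position.
func hasExtMagic(header []byte) bool {
	if len(header) < extMagicOffset+2 {
		return false
	}

	return binary.LittleEndian.Uint16(header[extMagicOffset:]) == extMagic
}

func main() {
	// Synthetic bare filesystem: magic at the expected offset.
	bareFS := make([]byte, 2048)
	binary.LittleEndian.PutUint16(bareFS[extMagicOffset:], extMagic)

	// Synthetic partitioned disk: only the boot signature in sector 0.
	partitionedDisk := make([]byte, 2048)
	partitionedDisk[510], partitionedDisk[511] = 0x55, 0xAA

	fmt.Println(hasExtMagic(bareFS), hasExtMagic(partitionedDisk))
}
```

A check of this kind is what fails in the "Bad magic number in super-block" message, which is consistent with the point above that growing a VM block volume should not involve a host-side filesystem resize.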

@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch 3 times, most recently from e57a3d1 to 319bf37 Compare November 13, 2024 20:32
@kadinsayani kadinsayani changed the title Support VM disk resize without reboot (from Incus) Support VM disk resize (zfs and lvm) without reboot (from Incus) Nov 13, 2024
@kadinsayani
Contributor Author

I've updated the PR and the tests are good to go. I've opened a new issue to track adding support for Ceph RBD volumes.

@simondeziel
Member

> I've updated the PR and the tests are good to go. I've opened a new issue to track adding support for Ceph RBD volumes.

I don't mind (too much) having this feature land in a per-driver fashion. However, I suspect/hope that Ceph is the special case here and all our other drivers would support live growing. I didn't hear back from you regarding the easy to test dir backend?
Next, we'll need to consider Powerflex and the other driver that's still baking.

stgraber and others added 4 commits November 15, 2024 10:07
…ple servers

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
(cherry picked from commit 73a78c2f0cc188c602c88be8cfdc9bfcfb9df0ab)
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
License: Apache-2.0
…esize

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
(cherry picked from commit 81f9c4b915830322871bb49d6f04f3009f63d01a)
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
License: Apache-2.0
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
@kadinsayani kadinsayani force-pushed the 13311-support-growing-vm-disk-size-without-reboot branch from 319bf37 to 4e0e1c0 Compare November 15, 2024 17:07
@kadinsayani
Contributor Author

> > I've updated the PR and the tests are good to go. I've opened a new issue to track adding support for Ceph RBD volumes.
>
> I don't mind (too much) having this feature land in a per-driver fashion. However, I suspect/hope that Ceph is the special case here and all our other drivers would support live growing. I didn't hear back from you regarding the easy to test dir backend? Next, we'll need to consider Powerflex and the other driver that's still baking.

dir is not supported with the changes in this PR thus far. I'm working on adding support for it :)

@tomponline mentioned that Powerflex is out of scope for this PR.

@hamistao
Contributor

@kadinsayani From what I can see this may also help with container live resizing (for both growing and shrinking) on block based drivers (i.e. lvm, ceph and zfs with volumes.zfs.block_mode enabled), as it currently is also not possible. I am also assuming this would not apply to ceph for the same reason we apparently can't resize VMs on it. To what extent are these assumptions correct?

@kadinsayani
Contributor Author

kadinsayani commented Nov 26, 2024

> @kadinsayani From what I can see this may also help with container live resizing (for both growing and shrinking) on block based drivers (i.e. lvm, ceph and zfs with volumes.zfs.block_mode enabled), as it currently is also not possible. I am also assuming this would not apply to ceph for the same reason we apparently can't resize VMs on it. To what extent are these assumptions correct?

Online shrinking is only possible for filesystem volumes. Online growing of block based drivers (zfs and lvm) will be possible for containers once this PR is merged (with volumes.zfs.block_mode enabled). Online growing of Ceph RBD block volumes is still under investigation, see #14462.
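
The rule stated in this reply (block volumes can only grow online, while shrinking is limited to filesystem volumes) could be expressed as a small validation step. This is a sketch under that reading of the comment, not code from the PR; `validateOnlineResize` and `errShrinkNotAllowed` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errShrinkNotAllowed is a hypothetical sentinel error for this sketch.
var errShrinkNotAllowed = errors.New("online shrinking of block volumes is not supported")

// validateOnlineResize checks a requested size change for a running instance.
// Block volumes may only grow; filesystem volumes may grow or shrink.
// Sizes are in bytes.
func validateOnlineResize(isBlock bool, oldSize int64, newSize int64) error {
	switch {
	case newSize <= 0:
		return fmt.Errorf("invalid size %d", newSize)
	case isBlock && newSize < oldSize:
		return errShrinkNotAllowed
	default:
		return nil
	}
}

func main() {
	// Growing a block volume from 10GiB to 11GiB.
	fmt.Println(validateOnlineResize(true, 10<<30, 11<<30))

	// Shrinking a block volume from 11GiB to 10GiB.
	fmt.Println(validateOnlineResize(true, 11<<30, 10<<30))
}
```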

@kadinsayani kadinsayani changed the title Support VM disk resize (zfs and lvm) without reboot (from Incus) Storage: support disk resize (zfs and lvm) without reboot (from Incus) Nov 26, 2024
@kadinsayani kadinsayani changed the title Storage: support disk resize (zfs and lvm) without reboot (from Incus) Storage: Add support for online disk growing of zfs and lvm block volumes (from Incus) Nov 26, 2024
@kadinsayani kadinsayani changed the title Storage: Add support for online disk growing of zfs and lvm block volumes (from Incus) Storage: Add support for online disk growing of zfs and lvm block volumes (from Incus) Nov 26, 2024
@kadinsayani kadinsayani marked this pull request as draft November 29, 2024 15:44
@tomponline
Member

@kadinsayani can we close this for now until you get a chance to look at this again?

@kadinsayani kadinsayani closed this Dec 9, 2024
Labels
API Changes to the REST API Documentation Documentation needs updating
Development

Successfully merging this pull request may close these issues.

Support growing the VM disk size without needing to reboot the VM
6 participants