
rbd: CephCSI cannot determine correct clone depth in certain case #4013

Open
Rakshith-R opened this issue Jul 24, 2023 · 13 comments · May be fixed by #4029
Assignees
Labels
bug Something isn't working component/rbd Issues related to RBD dependency/go-ceph depends on go-ceph functionality keepalive This label can be used to disable stale bot activity in the repo

Comments

@Rakshith-R (Contributor)

Describe the bug

If the parent PVC of a snapshot/restore/clone PVC is deleted,
then

func (ri *rbdImage) getCloneDepth(ctx context.Context) (uint, error) {

will not work as intended.
The clone depth it returns is therefore incorrect, since this function requires all parent images in the chain to be present in the cluster (deleted parent images are in the trash).

One side effect is the issue described in rook/rook#12312.
Creating a chain of clones/snapshot+restores and then deleting the parent snapshot and PVC renders the final child PVC unmountable.
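
For illustration, here is a minimal sketch (not the actual CephCSI code; the helper and error names are hypothetical) of a name-based depth walk and why a trashed parent breaks it:

    package main

    import (
        "errors"
        "fmt"
    )

    var errImageNotFound = errors.New("image not found")

    // getParentName stands in for a name-based parent lookup. Once a parent
    // image has been moved to the trash it can no longer be resolved by
    // name, so the lookup fails even though a parent still exists.
    func getParentName(image string) (string, error) {
        return "", errImageNotFound // elided: real lookup against the cluster
    }

    func cloneDepthByName(image string) (uint, error) {
        var depth uint
        for {
            parent, err := getParentName(image)
            if errors.Is(err, errImageNotFound) || parent == "" {
                // Bug: a trashed parent ends the walk early, so the returned
                // depth undercounts the real chain and no flatten task is added.
                return depth, nil
            }
            if err != nil {
                return 0, err
            }
            depth++
            image = parent
        }
    }

    func main() {
        depth, _ := cloneDepthByName("csi-vol-child")
        fmt.Println(depth) // prints 0, even if 17 trashed parents exist
    }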

Environment details

  • Image/version of Ceph CSI driver : All supported cephcsi versions.
  • Helm chart version : -
  • Kernel version : -
  • Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's
    krbd or rbd-nbd) : krbd
  • Kubernetes cluster version : all
  • Ceph cluster version : all

Steps to reproduce

Steps to reproduce the behavior:

  1. Create a chain of clones/snapshot+restores.
  2. Delete the parent snapshot/clone immediately after the child PVC/snapshot is created.
  3. Try to mount the child PVC to a pod.

Actual results

rbd map fails with the below error:

E0531 18:09:09.891563 15892 utils.go:210] ID: 198 Req-ID: 0001-0009-rook-ceph-0000000000000001-8b978541-8495-4c20-bcab-0a42fa927b5a GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.110.0.127:6789,10.109.223.145:6789,10.108.43.136:6789 --keyfile=***stripped*** map replicapool/csi-vol-8b978541-8495-4c20-bcab-0a42fa927b5a --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed rbd: map failed: (22) Invalid argument
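
(The EINVAL returned by rbd map is generic; the real reason is reported by the kernel and can be inspected with, for example:

    dmesg | grep rbd

See the dmesg output further down in this thread.)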

Expected behavior

No error.

Workaround

Flatten the child PVC image manually.
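
For example (a minimal sketch; the pool and image names are placeholders taken from the log above):

    rbd flatten replicapool/csi-vol-8b978541-8495-4c20-bcab-0a42fa927b5a

Flattening copies the parent data into the child image and detaches it from the parent chain, so krbd no longer has to walk the trashed parents when mapping.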

Possible Solution

From the CephCSI point of view, we have no other way to determine the clone depth.
We need an API change, possibly in rbd image info, that provides us with the clone depth from Ceph.

cc @ceph/ceph-csi-contributors

@Rakshith-R Rakshith-R added bug Something isn't working component/rbd Issues related to RBD labels Jul 24, 2023
@Rakshith-R (Contributor, Author)

Steps to reproduce:

  1. Create a [restore] PVC
  2. Create a snapshot
  3. Delete the parent PVC and the restore PVC
  4. Delete the snapshot
  5. Repeat step 2 (until there are 17 images in the trash: rbd trash ls <pool_name> | wc -l)
  6. Mount the child PVC to a pod

Nodeplugin logs:

I0724 12:32:42.749654    7461 utils.go:195] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 GRPC call: /csi.v1.Node/NodeStageVolume
I0724 12:32:42.749810    7461 utils.go:206] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/86f184afd78d2e10464567a1eaa6d77fd2b52867f9c4f76350333f39fc3dc557/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":7}},"volume_context":{"clusterID":"rook-ceph","imageFeatures":"layering","imageFormat":"2","imageName":"csi-vol-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6","journalPool":"replicapool","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1690201077472-8271-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6"}
I0724 12:32:42.750671    7461 omap.go:88] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 got omap values: (pool="replicapool", namespace="", name="csi.volume.c36ef26f-d656-4f3b-9b12-d54b9d9c98a6"): map[csi.imageid:12f02815b4b2 csi.imagename:csi-vol-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 csi.volname:pvc-475deef4-7f81-4750-b75f-e1e1b07a538f csi.volume.owner:rook-ceph]
I0724 12:32:43.170381    7461 rbd_util.go:352] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 checking for ImageFeatures: [layering operations]
I0724 12:32:43.198124    7461 cephcmds.go:105] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 command succeeded: rbd [device list --format=json --device-type krbd]
I0724 12:32:43.353638    7461 rbd_attach.go:419] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 rbd: map mon 192.168.39.232:6789
I0724 12:32:43.669135    7461 cephcmds.go:98] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 192.168.39.232:6789 --keyfile=***stripped*** map replicapool/csi-vol-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 --device-type krbd --options noudev]
W0724 12:32:43.669173    7461 rbd_attach.go:468] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 rbd: map error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 192.168.39.232:6789 --keyfile=***stripped*** map replicapool/csi-vol-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 --device-type krbd --options noudev], rbd output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument
E0724 12:32:43.669309    7461 utils.go:210] ID: 90 Req-ID: 0001-0009-rook-ceph-0000000000000001-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 192.168.39.232:6789 --keyfile=***stripped*** map replicapool/csi-vol-c36ef26f-d656-4f3b-9b12-d54b9d9c98a6 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument

Dmesg logs:
dmesg.txt

Images in the trash:

bash-4.4$ rbd trash ls replicapool | wc -l
18

cc @ceph/ceph-csi-contributors

@idryomov
Can you please provide your inputs on this issue?

@idryomov (Contributor)

> @idryomov
> Can you please provide your inputs on this issue?

The kernel client is failing to map an image because it has more than 16 images in the parent chain, most or all of which are in the trash:

[ 3327.965645] rbd: id 12f02b4046bf: unable to get image name
[ 3327.966827] rbd: id 12f041960b82: unable to get image name
[ 3327.968113] rbd: id 12f05858ad42: unable to get image name
[ 3327.969280] rbd: id 12f0c68079be: unable to get image name
[ 3327.970612] rbd: id 12f0cbf47a6: unable to get image name
[ 3327.972276] rbd: id 12f0193d54ed: unable to get image name
[ 3327.973473] rbd: id 115e71c81440: unable to get image name
[ 3327.974756] rbd: id 115e5c165e0c: unable to get image name
[ 3327.976137] rbd: id 115ef49affa9: unable to get image name
[ 3327.977415] rbd: id 115efd7ba12b: unable to get image name
[ 3327.978439] rbd: id 115e63c98b1a: unable to get image name
[ 3327.979750] rbd: id 115e1f145b8: unable to get image name
[ 3327.981078] rbd: id 115e919877e0: unable to get image name
[ 3327.982162] rbd: id 115e2403b185: unable to get image name
[ 3327.983619] rbd: id 115e5a5abd49: unable to get image name
[ 3327.984825] rbd: id 115e3d321ae6: unable to get image name
[ 3327.985458] rbd: parent chain is too long (17)

For further inputs, please translate the steps to reproduce into rbd commands. While doing that, you would likely see how/where the parent image build-up occurs.

@Rakshith-R (Contributor, Author)

> > @idryomov
> > Can you please provide your inputs on this issue?
>
> The kernel client is failing to map an image because it has more than 16 images in the parent chain, most or all of which are in the trash:

Yup, the images being in the trash is expected.

> For further inputs, please translate the steps to reproduce into rbd commands. While doing that, you would likely see how/where the parent image build-up occurs.

Equivalent rbd commands can be found here: https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/rbd-snap-clone.md
The build-up is happening because CephCSI cannot determine the clone chain depth correctly and is therefore not flattening the image (or adding a task to flatten it).
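
As a rough illustration only (a simplified approximation, not the exact sequence CephCSI issues; the real flow adds intermediate temp clones as described in the design doc above), the build-up can be reproduced with plain rbd commands like:

    rbd create replicapool/img-0 --size 1024
    for i in $(seq 1 17); do
        rbd snap create replicapool/img-$((i - 1))@snap
        rbd clone --rbd-default-clone-format 2 \
            replicapool/img-$((i - 1))@snap replicapool/img-$i
        rbd snap rm replicapool/img-$((i - 1))@snap
        # the deleted parent lingers in the trash while its clone exists
        rbd trash mv replicapool/img-$((i - 1))
    done
    rbd device map replicapool/img-17   # fails: parent chain is too long (17)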

In short,

  • CephCSI moves deleted images to the trash and adds a task to remove them.
  • CephCSI determines the clone chain depth by traversing the image chain:
    func (ri *rbdImage) getCloneDepth(ctx context.Context) (uint, error) {
    • This method fails if a parent PVC/snapshot is deleted and its image is in the trash.
    • Each Kubernetes PVC and snapshot is meant to be independent of its parent or child PVCs/snapshots. Therefore, we cannot impose any restriction here.
    • We need a way to determine the chain depth differently, maybe via a field in rbd image info?

@idryomov (Contributor)

I see. The way getCloneDepth() just returns the current depth when it encounters an empty image name looks wrong, as does bailing on an ErrImageNotFound error.

When getting parent details via the rbd_get_parent() API, the returned rbd_linked_image_spec_t struct has trash and image_id fields. If trash is true, the parent image is in the trash. In that case, the image ID from image_id can be used to open the parent image with the rbd_open_by_id() API, thus moving on to the next iteration.

@Rakshith-R Rakshith-R added this to the release-v3.9.1 milestone Jul 27, 2023
@Rakshith-R Rakshith-R added the dependency/ceph depends on core Ceph functionality label Jul 27, 2023
@Rakshith-R Rakshith-R added dependency/go-ceph depends on go-ceph functionality and removed dependency/ceph depends on core Ceph functionality labels Jul 27, 2023
@Rakshith-R (Contributor, Author)

> I see. The way getCloneDepth() just returns the current depth when it encounters an empty image name looks wrong, as does bailing on an ErrImageNotFound error.
>
> When getting parent details via the rbd_get_parent() API, the returned rbd_linked_image_spec_t struct has trash and image_id fields. If trash is true, the parent image is in the trash. In that case, the image ID from image_id can be used to open the parent image with the rbd_open_by_id() API, thus moving on to the next iteration.

Thanks!

Currently, go-ceph does not read and export those details:
https://github.com/ceph/go-ceph/blob/ce4031e218edce2afdc79a714b7123a74b1e3c78/rbd/snapshot_nautilus.go#L106-L109

We'll need to add it there before using it in cephcsi.
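
Once go-ceph exposes those fields, the cephcsi-side walk could look roughly like the sketch below. This is only a sketch, under the assumption that the parent's ImageSpec grows Trash and ImageID fields as discussed above; the exact go-ceph names (including the "no parent" error value) may differ from what finally lands:

    package rbddepth

    import (
        "errors"

        "github.com/ceph/go-ceph/rados"
        "github.com/ceph/go-ceph/rbd"
    )

    // cloneDepth walks the parent chain, following trashed parents by ID.
    // It assumes all parents live in the pool behind ioctx.
    func cloneDepth(ioctx *rados.IOContext, name string) (uint, error) {
        img, err := rbd.OpenImageReadOnly(ioctx, name, rbd.NoSnapshot)
        if err != nil {
            return 0, err
        }
        defer func() { _ = img.Close() }()

        var depth uint
        for {
            parent, err := img.GetParent()
            if errors.Is(err, rbd.ErrNotFound) {
                return depth, nil // assumed mapping for "image has no parent"
            }
            if err != nil {
                return 0, err
            }
            depth++

            var next *rbd.Image
            if parent.Image.Trash {
                // A trashed parent is only reachable by its ID, not by name.
                next, err = rbd.OpenImageByIdReadOnly(ioctx, parent.Image.ImageID, rbd.NoSnapshot)
            } else {
                next, err = rbd.OpenImageReadOnly(ioctx, parent.Image.ImageName, rbd.NoSnapshot)
            }
            if err != nil {
                return 0, err
            }
            _ = img.Close()
            img = next
        }
    }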

nixpanic added a commit to nixpanic/go-ceph that referenced this issue Jul 31, 2023
When a parent image has been removed, it will linger in the trash until
all siblings are gone. The image is not accessible through its name
anymore, only through its ID.

The ImageSpec that is returned by Image.GetParent() now contains the
Trash boolean and the ImageID, to identify whether the image is in the
trash and to use OpenImageById() to access the removed parent image.

Related-to: ceph/ceph-csi#4013
Signed-off-by: Niels de Vos <ndevos@ibm.com>
nixpanic added a commit to nixpanic/go-ceph that referenced this issue Aug 1, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/go-ceph that referenced this issue Aug 2, 2023 (same commit message as above)
@nixpanic nixpanic self-assigned this Aug 2, 2023
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Aug 2, 2023
The `getCloneDepth()` function did not account for images that are in
the trash. A trashed image can only be opened by the image-id, and not
by name anymore.

Closes: ceph#4013
Signed-off-by: Niels de Vos <ndevos@ibm.com>
mergify bot pushed a commit to ceph/go-ceph that referenced this issue Aug 2, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Aug 22, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Aug 22, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Aug 23, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Aug 30, 2023 (same commit message as above)
github-actions bot commented Sep 1, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Sep 1, 2023
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Sep 4, 2023
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Sep 4, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Sep 6, 2023 (same commit message as above)
github-actions bot commented Oct 4, 2023 with the same stale notice as above.

@github-actions github-actions bot added the wontfix This will not be worked on label Oct 4, 2023
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Oct 5, 2023
github-actions bot commented Nov 4, 2023 with the same stale notice as above.

@github-actions github-actions bot added the wontfix This will not be worked on label Nov 4, 2023

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 12, 2023
@Rakshith-R Rakshith-R reopened this Nov 13, 2023
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Nov 13, 2023
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Nov 13, 2023 (same commit message as above)
@nixpanic nixpanic removed this from the release-v3.10.0 milestone Nov 14, 2023
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Nov 14, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Nov 16, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Nov 17, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Nov 27, 2023 (same commit message as above)
nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Dec 1, 2023 (same commit message as above)
github-actions bot posted the same stale notice again.

@github-actions github-actions bot added the wontfix This will not be worked on label Dec 14, 2023
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Dec 15, 2023
github-actions bot posted the same stale notice again.

@github-actions github-actions bot added the wontfix This will not be worked on label Jan 14, 2024
github-actions bot posted the same close notice again.

@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 22, 2024
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Jan 23, 2024
@Rakshith-R Rakshith-R reopened this Jan 23, 2024
github-actions bot posted the same stale notice again.

@github-actions github-actions bot added the wontfix This will not be worked on label Feb 22, 2024
@nixpanic nixpanic added keepalive This label can be used to disable stale bot activity in the repo and removed wontfix This will not be worked on labels Feb 29, 2024