RBD Async: Failed to mirror cloned PVC created from another PVC (PVC-PVC Clone) #2426

Open
Madhu-1 opened this issue Aug 19, 2021 · 8 comments
Labels: component/rbd (Issues related to RBD), keepalive (This label can be used to disable stale bot activity in the repo)

@Madhu-1
Collaborator

Madhu-1 commented Aug 19, 2021

Mirroring fails for a cloned PVC that was created from another PVC.

Steps to Reproduce

  • Create a PVC
  • Create a Clone from the PVC
  • Create VolumeReplication to Enable Replication
I0819 06:47:21.653521       1 utils.go:176] ID: 1895 Req-ID: 0001-0009-rook-ceph-0000000000000005-26d8b1f5-00b9-11ec-89fe-0242ac110003 GRPC call: /replication.Controller/EnableVolumeReplication
I0819 06:47:21.653615       1 utils.go:178] ID: 1895 Req-ID: 0001-0009-rook-ceph-0000000000000005-26d8b1f5-00b9-11ec-89fe-0242ac110003 GRPC request: {"parameters":{"mirroringMode":"snapshot"},"secrets":"***stripped***","volume_id":"0001-0009-rook-ceph-0000000000000005-26d8b1f5-00b9-11ec-89fe-0242ac110003"}
I0819 06:47:21.657017       1 omap.go:86] ID: 1895 Req-ID: 0001-0009-rook-ceph-0000000000000005-26d8b1f5-00b9-11ec-89fe-0242ac110003 got omap values: (pool="replicapool-4", namespace="", name="csi.volume.26d8b1f5-00b9-11ec-89fe-0242ac110003"): map[csi.imageid:1972acb15df07 csi.imagename:csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003 csi.volname:pvc-4c80d66e-f7c5-4335-b058-fd4c9a26c68b csi.volume.owner:default]
E0819 06:47:21.774904       1 replicationcontrollerserver.go:245] ID: 1895 Req-ID: 0001-0009-rook-ceph-0000000000000005-26d8b1f5-00b9-11ec-89fe-0242ac110003 failed to enable mirroring on "replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003" with error: rbd: ret=-22, Invalid argument
E0819 06:47:21.774997       1 utils.go:185] ID: 1895 Req-ID: 0001-0009-rook-ceph-0000000000000005-26d8b1f5-00b9-11ec-89fe-0242ac110003 GRPC error: rpc error: code = Internal desc = failed to enable mirroring on "replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003" with error: rbd: ret=-22, Invalid argument
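
The reproduction steps above can be sketched as Kubernetes manifests. This is a minimal sketch, not taken from the issue: the object names, the storage class name, and the VolumeReplicationClass name are hypothetical, and the VolumeReplication API group/version is assumed to be the one used by the volume replication operator at the time (replication.storage.openshift.io/v1alpha1). The 1Gi size matches the rbd info output in this report.

```python
import json

# All names below are hypothetical placeholders, not values from the issue.
SOURCE_PVC = "rbd-pvc"
CLONE_PVC = "rbd-pvc-clone"
STORAGE_CLASS = "rook-ceph-block"
REPLICATION_CLASS = "rbd-volumereplicationclass"


def pvc(name, storage_class, size, clone_of=None):
    """Build a PVC manifest; if clone_of is set, the PVC is a PVC-PVC clone."""
    spec = {
        "storageClassName": storage_class,
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if clone_of is not None:
        # A PVC-PVC clone names the source PVC as its dataSource.
        spec["dataSource"] = {"kind": "PersistentVolumeClaim", "name": clone_of}
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def volume_replication(name, pvc_name, replication_class):
    """Build a VolumeReplication manifest (assumed v1alpha1 schema)."""
    return {
        "apiVersion": "replication.storage.openshift.io/v1alpha1",
        "kind": "VolumeReplication",
        "metadata": {"name": name},
        "spec": {
            "volumeReplicationClass": replication_class,
            "replicationState": "primary",
            "dataSource": {
                "apiGroup": "",
                "kind": "PersistentVolumeClaim",
                "name": pvc_name,
            },
        },
    }


manifests = [
    pvc(SOURCE_PVC, STORAGE_CLASS, "1Gi"),                        # step 1
    pvc(CLONE_PVC, STORAGE_CLASS, "1Gi", clone_of=SOURCE_PVC),    # step 2
    volume_replication("vr-clone", CLONE_PVC, REPLICATION_CLASS), # step 3
]

if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The JSON printed by the script could be piped to kubectl apply -f - on a test cluster, assuming the storage class and VolumeReplicationClass already exist under those names.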

I also tried to enable mirroring on the cloned RBD images manually, and that fails as well.

sh-4.4# rbd mirror image enable replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003 snapshot
2021-08-19T06:49:15.892+0000 7ff2b877e2c0 -1 librbd::api::Mirror: image_enable: mirroring is not enabled for the parent

sh-4.4# rbd mirror image enable replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp snapshot 
2021-08-19T06:52:31.379+0000 7fcfa8aca2c0 -1 librbd::api::Mirror: image_enable: mirroring is not enabled for the parent
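
The ret=-22 in the CSI log is a negated errno value, which is why the CSI error only says "Invalid argument" while the manual rbd run shows the underlying reason (mirroring is not enabled for the parent). A small sketch of decoding the code from such a log line follows; the helper name errno_name is hypothetical.

```python
import errno
import re

# Error text copied from the replicationcontrollerserver.go log line above.
LOG_LINE = (
    'failed to enable mirroring on '
    '"replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003" '
    'with error: rbd: ret=-22, Invalid argument'
)


def errno_name(log_line):
    """Return the symbolic errno name for a 'ret=<n>' code, or None if absent."""
    match = re.search(r"ret=(-?\d+)", log_line)
    if match is None:
        return None
    # The library reports errors as negative errno values, so take the magnitude.
    return errno.errorcode.get(abs(int(match.group(1))))


if __name__ == "__main__":
    print(errno_name(LOG_LINE))
```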

sh-4.4# rbd info replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003     
rbd image 'csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 1972acb15df07
	block_name_prefix: rbd_data.1972acb15df07
	format: 2
	features: layering, operations
	op_features: clone-child
	flags: 
	create_timestamp: Thu Aug 19 06:46:17 2021
	access_timestamp: Thu Aug 19 06:46:17 2021
	modify_timestamp: Thu Aug 19 06:46:17 2021
	parent: replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp@2a3f1d55-fdc6-424b-8e49-e46fa3ae9573
	overlap: 1 GiB
sh-4.4# rbd info replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp
rbd image 'csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 1
	id: 1972a3c5fd23a
	block_name_prefix: rbd_data.1972a3c5fd23a
	format: 2
	features: layering, deep-flatten, operations
	op_features: clone-parent, clone-child, snap-trash
	flags: 
	create_timestamp: Thu Aug 19 06:46:16 2021
	access_timestamp: Thu Aug 19 06:46:16 2021
	modify_timestamp: Thu Aug 19 06:46:16 2021
	parent: replicapool-4/csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003@e91ecb33-f05d-469d-831e-200247bbbd38
	overlap: 1 GiB
sh-4.4# rbd info replicapool-4/csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003
rbd image 'csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 1
	id: 1972a663f3297
	block_name_prefix: rbd_data.1972a663f3297
	format: 2
	features: layering, operations
	op_features: clone-parent, snap-trash
	flags: 
	create_timestamp: Thu Aug 19 06:45:56 2021
	access_timestamp: Thu Aug 19 06:45:56 2021
	modify_timestamp: Thu Aug 19 06:45:56 2021
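
The three rbd info outputs above describe a clone chain: the cloned PVC's image has the -temp image as its parent, and the -temp image has the source PVC's image as its parent. A minimal sketch that recovers this chain from the parent field is shown below. The rbd info text is abbreviated from the output above, and the helper names parent_image and clone_chain are hypothetical.

```python
import re

CLONE = "csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003"
TEMP = "csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp"
SOURCE = "csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003"

# Abbreviated `rbd info` output for each image, keyed by image name.
RBD_INFO = {
    CLONE: (
        "rbd image '%s':\n"
        "    op_features: clone-child\n"
        "    parent: replicapool-4/%s@2a3f1d55-fdc6-424b-8e49-e46fa3ae9573\n"
        % (CLONE, TEMP)
    ),
    TEMP: (
        "rbd image '%s':\n"
        "    op_features: clone-parent, clone-child, snap-trash\n"
        "    parent: replicapool-4/%s@e91ecb33-f05d-469d-831e-200247bbbd38\n"
        % (TEMP, SOURCE)
    ),
    SOURCE: (
        "rbd image '%s':\n"
        "    op_features: clone-parent, snap-trash\n"
        % SOURCE
    ),
}

PARENT_RE = re.compile(r"^\s*parent:\s*[^/\s]+/(?P<image>[^@\s]+)@", re.MULTILINE)


def parent_image(info_text):
    """Return the parent image name from `rbd info` text, or None if there is no parent."""
    match = PARENT_RE.search(info_text)
    return match.group("image") if match else None


def clone_chain(image, infos):
    """Follow parent links from `image` up to the root; the result is child-first."""
    chain = [image]
    while True:
        parent = parent_image(infos[chain[-1]])
        if parent is None:
            return chain
        chain.append(parent)


if __name__ == "__main__":
    for name in clone_chain(CLONE, RBD_INFO):
        print(name)
```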
@Madhu-1 Madhu-1 added the component/rbd Issues related to RBD label Aug 19, 2021
@Madhu-1
Collaborator Author

Madhu-1 commented Aug 19, 2021

cc @ShyamsundarR

@Madhu-1
Collaborator Author

Madhu-1 commented Aug 19, 2021

Because we create a temporary clone, we need to handle image cloning and image deletion properly. There should be no stale images on either cluster.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Sep 18, 2021
@Rakshith-R Rakshith-R removed the wontfix This will not be worked on label Sep 20, 2021
@Rakshith-R Rakshith-R added this to the release-3.5.0 milestone Sep 20, 2021
@github-actions github-actions bot added the wontfix This will not be worked on label Oct 20, 2021
@github-actions

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@Madhu-1 Madhu-1 reopened this Oct 28, 2021
@Madhu-1 Madhu-1 added keepalive This label can be used to disable stale bot activity in the repo and removed wontfix This will not be worked on labels Oct 28, 2021
@humblec
Collaborator

humblec commented Jan 4, 2022

@Madhu-1 this is marked against the 3.5.0 release, so please revisit the state.

@ushitora-anqou

My team verified that a cloned PVC was mirrored to the secondary Ceph cluster with v3.9.0 plus the attached POC patch. The patch removes the two snap rm steps from the following PVC clone process:

https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/rbd-snap-clone.md#volume-cloning-datasource-pvc

The steps to reproduce are:

  1. Create two Ceph clusters, C1 and C2.
  2. Set up RBD mirroring from C1 to C2 using Rook's CephRBDMirror CR.
  3. Create a PVC (PVC1) in C1.
  4. Create a VolumeReplication (VR1) corresponding to PVC1 in C1.
  5. Create a cloned PVC, PVC2, from PVC1.
  6. Create a VolumeReplication (VR2) corresponding to PVC2 in C1.
  7. Run rbd mirror image enable <image> snapshot for the following RBD images:
    7-1: The RBD image corresponding to PVC1 (again).
    7-2: The intermediate RBD image between PVC1 and PVC2.
    7-3: The RBD image corresponding to PVC2.
  8. Both PVC1 and PVC2 are then mirrored as expected.
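
Step 7 enables mirroring from the root of the clone chain down to the clone, which matches the librbd error reported earlier in this issue (mirroring is not enabled for the parent). A minimal sketch of generating the commands in that parent-first order follows. The pool and image names are hypothetical placeholders, and the function enable_commands is illustrative only; the command form itself is the one used in step 7.

```python
# Hypothetical pool and image names; substitute the real ones from `rbd info`.
POOL = "replicapool"
CHAIN_CHILD_FIRST = [
    "pvc2-image",          # image backing the cloned PVC (step 7-3)
    "intermediate-image",  # temporary image between PVC1 and PVC2 (step 7-2)
    "pvc1-image",          # image backing the source PVC (step 7-1)
]


def enable_commands(pool, chain_child_first, mode="snapshot"):
    """Return `rbd mirror image enable` commands, ordered root parent first."""
    return [
        f"rbd mirror image enable {pool}/{image} {mode}"
        for image in reversed(chain_child_first)
    ]


if __name__ == "__main__":
    # Only prints the commands; nothing is executed against a cluster.
    for command in enable_commands(POOL, CHAIN_CHILD_FIRST):
        print(command)
```

The sketch only prints the commands so they can be reviewed before being run by hand inside the Ceph toolbox.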

We also verified that mirroring PVC2 does not work with plain v3.9.0, because steps 7-2 and 7-3 fail. I suspect this is because both temporary RBD image snapshots are missing. My patch skips the removal of these snapshots.

rbd-mirror-workaround-prototype.patch

@Rakshith-R
Contributor

Hey @nbalacha,
Can you please take a look at the above comments?

When mirroring a chain of RBD images, do we need the intermediate RBD snapshots to stay alive (not deleted or in the trash)?
Will this also be fixed by the patch you are working on?

Ceph-CSI deletes all intermediate RBD snapshots.

cc @idryomov @pkalever
