Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSI snapshots of an erasure-coded-backed volume stores snapshot data in the metadata pool #4761

Open
Champ-Goblem opened this issue Aug 13, 2024 · 4 comments
Assignees
Labels
bug Something isn't working component/rbd Issues related to RBD keepalive This label can be used to disable stale bot activiity in the repo need test case

Comments

@Champ-Goblem
Copy link

Champ-Goblem commented Aug 13, 2024

For reference, I asked about the following in the Ceph Slack: https://ceph-storage.slack.com/archives/C05522L7P60/p1723555634369239

Describe the bug

When creating snapshots of erasure-coded RBD volumes the data is stored in the replicated metadata pool.

From the investigation into the ceph-csi code, the data pool option is not set during CreateSnapshot->GenVolFromVolID.
This instance of rbdVolume is passed to doSnapshotClone, which calls createRBDClone.
During the clone process:

  • a snapshot is created of the original volume
  • a new rbd image is created from the snapshot of the original image <- issue happens here
  • the original snapshot is deleted

cloneRbdImageFromSnapshot copies the RBD image options from the cloneRbdVol, created based on the rbdVolume generated from GenVolFromVolID. These options are passed to librbd.CloneImage and thus copies the data into the erasure-coded metadata pool, rather than keeping the data in the original erasure-coded pool.

This can be seen when inspecting the image in rbd with rbd info, the image is missing the data_pool field:

rbd info -p ec-metadatapool-us-east-1b csi-snap-12f5524f-de0d-4c21-bc4f-af843960337b
rbd image 'csi-snap-12f5524f-de0d-4c21-bc4f-af843960337b':
	size 30 GiB in 7680 objects
	order 22 (4 MiB objects)
	snapshot_count: 1
	id: 1933d7e8dc4d58
	block_name_prefix: rbd_data.1933d7e8dc4d58
	format: 2
	features: layering, deep-flatten, operations
	op_features: clone-child
	flags: 
	create_timestamp: Sat Aug 10 09:40:32 2024
	access_timestamp: Sat Aug 10 09:40:32 2024
	modify_timestamp: Sat Aug 10 09:40:32 2024
	parent: ec-metadatapool-us-east-1b/csi-vol-5aea1d4e-6575-492e-ad9d-f1c378d9f21c@a0684712-eadf-40b9-8d33-e46711a19fc2
	overlap: 30 GiB

Compare this to the original image:

rbd image 'csi-vol-5aea1d4e-6575-492e-ad9d-f1c378d9f21c':
	size 30 GiB in 7680 objects
	order 22 (4 MiB objects)
	snapshot_count: 88
	id: 17e84e87be3a0c
	data_pool: ec-blockpool-us-east-1b
	block_name_prefix: rbd_data.17.17e84e87be3a0c
	format: 2
	features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, data-pool, operations
	op_features: clone-parent, snap-trash
	flags: 
	create_timestamp: Thu Aug  8 14:44:59 2024
	access_timestamp: Mon Aug 12 10:24:16 2024
	modify_timestamp: Thu Aug  8 14:44:59 2024

Steps to reproduce

Steps to reproduce the behavior: Snapshot an erasure-coded volume

Expected behavior

The snapshot should remain within the original erasure-coded pool and not have data copied to the replicated metadata pool.

@Madhu-1 Madhu-1 added bug Something isn't working need test case component/rbd Issues related to RBD labels Aug 14, 2024
@Champ-Goblem
Copy link
Author

I have temporarily patched this problem internally so we can test if the change helps with our OSD disk usage.
I have generated a patch for the changes I made and I will upload it here for reference, in case it helps with the final fix.

From f3262f9fd84237c8617cd78f2a8792b871bc9ff2 Mon Sep 17 00:00:00 2001
From: Champ-Goblem <cameron@northflank.com>
Date: Wed, 14 Aug 2024 12:13:10 +0100
Subject: [PATCH] csi-snapshot: fetch data-pool info from rbd_get_data_pool_id
 api call

---
 internal/rbd/controllerserver.go          |  2 ++
 internal/rbd/rbd_util.go                  | 23 +++++++++++++++++++++++
 vendor/github.com/ceph/go-ceph/rbd/rbd.go |  8 ++++++++
 3 files changed, 33 insertions(+)

diff --git a/internal/rbd/controllerserver.go b/internal/rbd/controllerserver.go
index 4f07ce5f0..8e7d7b301 100644
--- a/internal/rbd/controllerserver.go
+++ b/internal/rbd/controllerserver.go
@@ -1328,6 +1328,8 @@ func (cs *ControllerServer) doSnapshotClone(
 	f := []string{librbd.FeatureNameLayering, librbd.FeatureNameDeepFlatten}
 	cloneRbd.ImageFeatureSet = librbd.FeatureSetFromNames(f)
 
+	cloneRbd.DataPool = parentVol.DataPool
+
 	err := cloneRbd.Connect(cr)
 	if err != nil {
 		return cloneRbd, err
diff --git a/internal/rbd/rbd_util.go b/internal/rbd/rbd_util.go
index 31b330b4d..ad3e07ce1 100644
--- a/internal/rbd/rbd_util.go
+++ b/internal/rbd/rbd_util.go
@@ -561,6 +561,17 @@ func (ri *rbdImage) isInUse() (bool, error) {
 	return len(watchers) > defaultWatchers, nil
 }
 
+func (ri *rbdImage) GetDataPoolId() (int64, error) {
+	image, err := ri.open()
+	if err != nil {
+		return -1, err
+	}
+	defer image.Close()
+
+	id, err := image.GetDataPoolId()
+	return id, err
+}
+
 // checkValidImageFeatures check presence of imageFeatures parameter. It returns false when
 // there imageFeatures is present and empty.
 func checkValidImageFeatures(imageFeatures string, ok bool) bool {
@@ -1158,6 +1169,18 @@ func generateVolumeFromVolumeID(
 	}
 	err = rbdVol.getImageInfo()
 
+	dataPoolId, getDataPoolErr := rbdVol.GetDataPoolId()
+	if dataPoolId != util.InvalidPoolID && getDataPoolErr == nil {
+		dataPoolName, getDataPoolErr := util.GetPoolName(rbdVol.Monitors, cr, dataPoolId)
+		if getDataPoolErr == nil {
+			rbdVol.DataPool = dataPoolName
+		} else {
+			log.ErrorLog(ctx, "failed to get data pool name: %w", getDataPoolErr)
+		}
+	} else if getDataPoolErr != nil {
+		log.ErrorLog(ctx, "failed to get data pool id: %w", getDataPoolErr)
+	}
+
 	return rbdVol, err
 }
 
diff --git a/vendor/github.com/ceph/go-ceph/rbd/rbd.go b/vendor/github.com/ceph/go-ceph/rbd/rbd.go
index b05b14397..2dea6c02f 100644
--- a/vendor/github.com/ceph/go-ceph/rbd/rbd.go
+++ b/vendor/github.com/ceph/go-ceph/rbd/rbd.go
@@ -1008,6 +1008,14 @@ func (image *Image) SetSnapshot(snapname string) error {
 	return getError(C.rbd_snap_set(image.image, cSnapName))
 }
 
+func (image *Image) GetDataPoolId() (int64, error) {
+	if err := image.validate(imageIsOpen); err != nil {
+		return -1, err
+	}
+
+	return int64(C.rbd_get_data_pool_id(image.image)), nil
+}
+
 // GetTrashList returns a slice of TrashInfo structs, containing information about all RBD images
 // currently residing in the trash.
 func GetTrashList(ioctx *rados.IOContext) ([]TrashInfo, error) {
-- 
2.46.0

@Champ-Goblem
Copy link
Author

@iPraveenParihar any update on this?

@iPraveenParihar
Copy link
Contributor

iPraveenParihar commented Aug 26, 2024

@Champ-Goblem, Thanks for opening up the issue.

Since DataPool is not set (should be in generateVolumeFromVolumeID()?) when doing snapshot/clone operations the images will not be in the required data pool.

Can we use GetString(RbdImageOptionDataPool) to get datapool value as we are setting the datapool value when creating an image?

if rv.DataPool != "" {
logMsg += ", data pool %s" + rv.DataPool
err = options.SetString(librbd.RbdImageOptionDataPool, rv.DataPool)
if err != nil {
return fmt.Errorf("failed to set data pool: %w", err)
}
}

@nixpanic WDYT?

Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Sep 25, 2024
@iPraveenParihar iPraveenParihar added keepalive This label can be used to disable stale bot activiity in the repo and removed wontfix This will not be worked on labels Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working component/rbd Issues related to RBD keepalive This label can be used to disable stale bot activiity in the repo need test case
Projects
None yet
Development

No branches or pull requests

3 participants