
[WIP] Sends V*ReplicationClass to client #2727

Open · raaizik wants to merge 2 commits into main from RHSTOR-5753
Conversation

@raaizik (Contributor) commented Aug 1, 2024

Changes

Sends two VRCs (one for image flattening) and one VGRC. Depends on #2620

RHSTOR-5753, RHSTOR-5794

@raaizik (Contributor, Author) commented Aug 1, 2024

/cc @Madhu-1

@openshift-ci openshift-ci bot requested a review from Madhu-1 August 1, 2024 13:03
@raaizik raaizik force-pushed the RHSTOR-5753 branch 2 times, most recently from cb85825 to eeb74d1 Compare August 6, 2024 10:31
@raaizik (Contributor, Author) commented Aug 6, 2024

/cc @rewantsoni

@openshift-ci openshift-ci bot requested a review from rewantsoni August 6, 2024 10:35
@rewantsoni (Member) left a comment

For Volume*ReplicationClass we are sending only the parameters field, as we did for Volume*SnapshotClass and StorageClass. Should we send the entire spec for it, or maybe the entire object, which would allow us to add the required labels/annotations (ramen labels, reclaimspace annotation) from the provider itself? The client would then be responsible for adding missing fields, like ProvisionerName and the namespace for secrets, and for creating/updating it.

@nb-ohad WDYT?

Comment on lines +680 to +763
Data: mustMarshal(map[string]string{
"replication.storage.openshift.io/replication-secret-name": provisionerSecretName,
@rewantsoni (Member):
For the flatten VRC, we require the special parameter flattenMode: force to be added; we should add it from the provider side.
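
A minimal sketch of what this request implies on the provider side; the helper name is assumed for illustration and is not code from this PR:

```go
// flattenVRCParameters is a hypothetical helper: the flatten VRC's
// parameters carry flattenMode: force in addition to the replication
// secret name, both supplied by the provider.
func flattenVRCParameters(provisionerSecretName string) map[string]string {
	return map[string]string{
		"replication.storage.openshift.io/replication-secret-name": provisionerSecretName,
		// Special parameter that forces image flattening for this class.
		"flattenMode": "force",
	}
}
```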

Comment on lines 673 to 753
Data: mustMarshal(map[string]string{
"replication.storage.openshift.io/replication-secret-name": provisionerSecretName,
}),
@rewantsoni (Member):

Along with the replication secret name, we also require a mirroringMode key that can be sent from the provider; could you add that as well in all the V*RCs?
Ref: https://github.com/red-hat-storage/odf-multicluster-orchestrator/blob/main/controllers/drpolicy_controller.go#L170-L178
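
A minimal sketch of the requested addition, with an assumed helper name; the linked drpolicy_controller.go uses "snapshot" as the RBD mirroring mode:

```go
// vrcParameters is a hypothetical helper: every V*RC's parameters include
// mirroringMode alongside the replication secret name, both sent by the
// provider rather than filled in by the client.
func vrcParameters(provisionerSecretName, mirroringMode string) map[string]string {
	return map[string]string{
		"replication.storage.openshift.io/replication-secret-name": provisionerSecretName,
		"mirroringMode": mirroringMode, // e.g. "snapshot", per the MCO reference above
	}
}
```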

Comment on lines 694 to 790
Data: mustMarshal(map[string]string{
"replication.storage.openshift.io/group-replication-secret-name": provisionerSecretName,
}),
@rewantsoni (Member):

For the flatten VGRC, we require the special parameter flattenMode: force to be added; we should add it from the provider side.


@raaizik (Contributor, Author) commented Aug 6, 2024

/cc @Rakshith-R

openshift-ci bot (Contributor) commented Aug 6, 2024

@raaizik: GitHub didn't allow me to request PR reviews from the following users: Rakshith-R.

Note that only red-hat-storage members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @Rakshith-R

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

}),
},
&pb.ExternalResource{
Name: "ceph-rbd-image-flattening",
(Member):

Suggested change
Name: "ceph-rbd-image-flattening",
Name: "ceph-rbd-flatten",

Please make sure everything matches with https://github.com/red-hat-storage/odf-multicluster-orchestrator/blob/b3df88c23b06d80515f27a36f656296e2d413681/controllers/drpolicy_controller.go#L212

@raaizik (Contributor, Author) replied Aug 7, 2024:

I notice the name's suffix is a hashed scheduling interval. Since I am watching the new DRClusterConfig resource (which provides the scheduling intervals) from the client operator, I'd need to override this field there. But I guess it doesn't matter on your end, because all you care about is that the client operator sends the correct name. Also, didn't you mean this?:

RBDFlattenVolumeReplicationClassNameTemplate = "rbd-flatten-volumereplicationclass-%v"
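
For illustration, a minimal sketch of how that template expands, assuming the %v is filled with the hashed scheduling interval mentioned above (the helper name is hypothetical):

```go
import "fmt"

const RBDFlattenVolumeReplicationClassNameTemplate = "rbd-flatten-volumereplicationclass-%v"

// flattenVRCName expands the MCO template; the suffix is a hash derived
// from the DRPolicy scheduling interval.
func flattenVRCName(schedulingIntervalHash string) string {
	return fmt.Sprintf(RBDFlattenVolumeReplicationClassNameTemplate, schedulingIntervalHash)
}
```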

@raaizik raaizik force-pushed the RHSTOR-5753 branch 3 times, most recently from ffd5355 to 5c428c0 Compare August 7, 2024 12:47
@Rakshith-R (Member) left a comment

"replication.storage.openshift.io/is-default-class": "true",
This should be added as a annotation

and
"replication.storage.openshift.io/flatten-mode": "force",

This as a label

hope both of the above will be handled accordingly in consumer side.

Other than the above, everything else looks good to me.

@raaizik (Contributor, Author) commented Aug 7, 2024

"replication.storage.openshift.io/is-default-class": "true", This should be added as a annotation

and "replication.storage.openshift.io/flatten-mode": "force",

This as a label

hope both of the above will be handled accordingly in consumer side.

Other than the above, everything else looks good to me.

@Rakshith-R I'm aware and it will.

@nb-ohad (Contributor) commented Aug 7, 2024

> For Volume*ReplicationClass we are sending only the parameters field, as we did for Volume*SnapshotClass and StorageClass. Should we send the entire spec for it, or maybe the entire object, which would allow us to add the required labels/annotations (ramen labels, reclaimspace annotation) from the provider itself? The client would then be responsible for adding missing fields, like ProvisionerName and the namespace for secrets, and for creating/updating it.
>
> @nb-ohad WDYT?

I will start with the latter proposal:

> maybe the entire object that will allow us to add labels/annotations required

We do not want to do that: an object contains not just desired state but also status and some metadata that is local to the provider cluster. In internal discussions we already acknowledged the need to send labels and annotations, but not as part of the serialized spec we send (the data field of the ExternalResource struct). The plan is to add them to the metadata of the ExternalResource object, similar to the kind and name. Until that is implemented, we will keep having logic in the client to add the missing metadata.

> Should we send the entire spec

This depends on the structure of the type. We originally used data because that was the desired state for ConfigMaps and Secrets. If the Volume*ReplicationClass has a formal spec field, it makes sense to serialize that instead of a map[string]string.
As a matter of fact, we are already doing something similar for the ClusterResourceQuota CR (https://github.com/red-hat-storage/ocs-operator/blob/main/services/provider/server/server.go#L376-L399)
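
A rough sketch of that approach, assuming the csi-addons VolumeReplicationClassSpec type; the import path, field names, and helper are assumptions for illustration, not code from this PR:

```go
import (
	"encoding/json"

	replicationv1alpha1 "github.com/csi-addons/kubernetes-csi-addons/apis/replication.storage/v1alpha1"
)

// vrcSpecData is a hypothetical helper: serialize the formal spec rather
// than a bare map[string]string, mirroring the ClusterResourceQuota approach.
func vrcSpecData(provisioner string, params map[string]string) ([]byte, error) {
	spec := replicationv1alpha1.VolumeReplicationClassSpec{
		Provisioner: provisioner,
		Parameters:  params,
	}
	// The client deserializes this into the VRC spec and fills in any
	// client-local fields before creating/updating the object.
	return json.Marshal(&spec)
}
```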

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 12, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2024
@raaizik (Contributor, Author) commented Aug 26, 2024

/test ocs-operator-bundle-e2e-aws

1 similar comment
@raaizik (Contributor, Author) commented Aug 26, 2024

/test ocs-operator-bundle-e2e-aws

@raaizik (Contributor, Author) commented Sep 5, 2024

/test ocs-operator-bundle-e2e-aws

@nb-ohad (Contributor) commented Sep 8, 2024

/lgtm

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 8, 2024
@raaizik (Contributor, Author) commented Sep 8, 2024

/retest

@rewantsoni (Member) commented Sep 8, 2024

/hold
Holding as per #2620 (comment): the comment requests adding the metadata (labels and annotations) separately, and we are currently sending the labels and annotations in the Data (Spec) field.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 8, 2024
openshift-ci bot (Contributor) commented Sep 11, 2024

New changes are detected. LGTM label has been removed.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2024
@raaizik (Contributor, Author) commented Sep 11, 2024

> /hold Holding as per #2620 (comment): the comment requests adding the metadata (labels and annotations) separately, and we are currently sending the labels and annotations in the Data (Spec) field.

Can unhold now, @rewantsoni 👍

@rewantsoni (Member) commented Sep 11, 2024

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 11, 2024
@raaizik (Contributor, Author) commented Sep 11, 2024

/test ocs-operator-bundle-e2e-aws

@rewantsoni (Member) left a comment

Should we consider sending the spec of Volume*ReplicationClass instead of spec.parameters from the provider server?

Comment on lines +1159 to +1166
} else if extResource.Kind == "VolumeReplicationClass" {
name = fmt.Sprintf("%s-volumereplicationclass", name)
} else if extResource.Kind == "VolumeGroupReplicationClass" {
name = fmt.Sprintf("%s-volumegroupreplicationclass", name)
@rewantsoni (Member):

As we are not creating these resources for filesystems, we don't need them here.

@raaizik (Contributor, Author) replied:

This is not dependent on whether it's for FS or not

"replication.storage.openshift.io/replication-secret-name": provisionerSecretName,
}),
Labels: map[string]string{
"mirroringMode": "snapshot",
@rewantsoni (Member):

mirroringMode is not a label but a parameter under the spec (spec.parameters) section of the VolumeReplicationClass

Comment on lines 35 to 36
Labels any `json:"labels"`
Annotations any `json:"annotations"`
@rewantsoni (Member):

As we know the types for Labels and Annotations, let's use those instead of any.
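
A minimal sketch of the suggestion; the surrounding fields are assumed from the snippets quoted in this PR, not copied from the actual struct:

```go
// ExternalResource with concretely typed metadata fields: Kubernetes labels
// and annotations are always string-to-string maps, so `any` is unnecessary.
type ExternalResource struct {
	Name        string            `json:"name"`
	Kind        string            `json:"kind"`
	Data        []byte            `json:"data"`
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations"`
}
```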

Comment on lines 35 to 36
Labels any `json:"labels"`
Annotations any `json:"annotations"`
@rewantsoni (Member):

We are missing an assert on the labels in the tests.

@raaizik raaizik force-pushed the RHSTOR-5753 branch 2 times, most recently from dffe4af to bd6ccbb Compare September 11, 2024 12:27
@raaizik (Contributor, Author) commented Sep 11, 2024

> Should we consider sending the spec of Volume*ReplicationClass instead of spec.parameters from the provider server?

I think we've had a previous discussion about this or something similar. IMO the server should send only the info that's required, not the entire spec.

Sends two V*RCs (one for image flattening) for RDR

Signed-off-by: raaizik <132667934+raaizik@users.noreply.github.com>
Co-Authored-By: Rewant Soni <rewant.soni@gmail.com>
@raaizik raaizik changed the title Sends V*ReplicationClass to client [WIP] Sends V*ReplicationClass to client Sep 11, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 11, 2024
Signed-off-by: raaizik <132667934+raaizik@users.noreply.github.com>
@raaizik (Contributor, Author) commented Sep 23, 2024

/test ocs-operator-bundle-e2e-aws

Comment on lines +358 to +359
assert.True(t, reflect.DeepEqual(extResource.Labels, mockResoruce.Labels))
assert.True(t, reflect.DeepEqual(extResource.Annotations, mockResoruce.Annotations))
@rewantsoni (Member):

We could use assert.Equal(); it uses reflect.DeepEqual internally.

@raaizik (Contributor, Author) replied Sep 23, 2024:

Both labels and annotations are maps, which means that in order to use assert.Equal() I'd need to convert the map to []byte or string, which is not straightforward. Therefore I've chosen this assertion instead, so it stays compact as a one-liner.

@rewantsoni (Member) replied Sep 23, 2024:

I don't think we need to convert the map to a byte slice or string. See here.
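
A small self-contained test sketch of the point (the package name is assumed): testify's assert.Equal compares values via reflect.DeepEqual internally, so it handles maps directly.

```go
package server

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

func TestAssertEqualOnMaps(t *testing.T) {
	got := map[string]string{"mirroringMode": "snapshot"}
	want := map[string]string{"mirroringMode": "snapshot"}
	// No conversion to []byte or string is needed; maps compare directly.
	assert.Equal(t, want, got)
}
```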

}),
Labels: getExternalResourceLabels("VolumeReplicationClass", replicationEnabled, mirroringEnabled, true, ID, SID),
Annotations: map[string]string{
"replication.storage.openshift.io/is-default-class": "true",
@rewantsoni (Member):

From what I recall, we don't require this annotation for the flatten VRC.

@raaizik (Contributor, Author) replied Sep 23, 2024:

I'm pretty sure it is needed. See here

@rewantsoni (Member) replied:

MCO creates two VRCs: one with flatten mode and one without. The one without flatten mode has the annotation; the other one doesn't. See here and here. We set the annotations to be empty when creating the flatten-mode VRC.

The same is also mentioned in the design doc, under "Changes required".

}),
Labels: getExternalResourceLabels("VolumeGroupReplicationClass", replicationEnabled, mirroringEnabled, true, ID, SID),
Annotations: map[string]string{
"replication.storage.openshift.io/is-default-class": "true",
@rewantsoni (Member):

From what I recall, we don't require this annotation for the flatten VGRC.

@raaizik (Contributor, Author) replied:

See reply

@rewantsoni (Member) replied:

See reply

@@ -832,6 +911,39 @@ func (s *OCSProviderServer) GetStorageClaimConfig(ctx context.Context, req *pb.S

}

func getExternalResourceLabels(kind string, isReplicationEnabled bool, isMirroringEnabled bool, isFlattenMode bool,
@rewantsoni (Member):

We don't need to differentiate between the Replication and Mirroring flags; we can have one mirroring flag (which will check the CephBlockPool) and add labels based on that.

@raaizik (Contributor, Author) replied:

From my perspective, there's no other way to handle these flags. Replication is Vol*RepClass-specific, while Mirroring applies to both Vol*RepClass and Vol*SnapClass.

@rewantsoni (Member) replied:

Your replication flag depends on the StorageClusterPeer, whereas the mirroring flag is derived from the CephBlockPool. Enabling mirroring on the CephBlockPool is the responsibility of the StorageClusterPeer, so aren't both eventually pointing to the same field?

I think we should add the labels for RBD-based VRC/VSC and SC based only on the mirroring field. It is possible that a particular blockpool is not selected for mirroring by the StorageClusterPeer CR. See here.

I was going through the PR and I'm not able to find the labels applied to CephFS-based V*SC and SC. Do we not need them?

@raaizik (Contributor, Author) replied Sep 23, 2024:

> Enabling mirroring on the cephblockpool is the responsibility of the storageClusterPeer.

I am not aware of such a thing. How does it fetch this flag exactly? Is it merged already?

> I am not able to find the labels applied for CephFS-based V*SC and SC

As discussed, all of that will be handled in another PR. This PR covers only Vol*RepClass.

@rewantsoni (Member) replied:

> I am not aware of such a thing. How does it fetch this flag exactly? Is it merged already?

This was discussed in the design discussion: each StorageClusterPeer will enable mirroring on the CephBlockPool based on a label selector.

For RBD we can use the mirroring field on the CephBlockPool to add the ramen-related labels; we don't need a separate replication flag.

@raaizik (Contributor, Author) replied:

I think you're misunderstanding. I am using the Mirroring flag solely for the StorageID label; it's unrelated to the replication ID.

@rewantsoni (Member) replied:

Why do we need two flags for the same set of ramen labels? From what I understand, we are creating a differentiation we don't need; it will just complicate things.

As an example, suppose we have two CephBlockPools, a and b, and the StorageClusterPeer is configured only for blockpool a. There are two clients, 1 and 2: client 1 uses blockpool a, client 2 uses blockpool b.

If we keep the flags separate, for client 1 we will enable both the replication and storage ID labels, while for client 2 we will enable only the replication ID label. In the second case we don't require any ramen-related labels to be added, because that blockpool is not configured for mirroring.

Hence we don't need to check both the StorageClusterPeer and the CephBlockPool; checking only the CephBlockPool will work.
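
A minimal sketch of the single-flag approach argued for here; the helper name and label keys below are illustrative placeholders, not the exact keys used by ramen:

```go
// getRamenLabels is a hypothetical helper: all ramen-related labels derive
// from the CephBlockPool's mirroring state alone, with no separate
// replication flag.
func getRamenLabels(kind string, mirroringEnabled bool, storageID, replicationID string) map[string]string {
	labels := map[string]string{}
	if !mirroringEnabled {
		// Blockpool not selected for mirroring: no ramen labels needed.
		return labels
	}
	labels["example.ramen/storageID"] = storageID // placeholder key
	if kind == "VolumeReplicationClass" || kind == "VolumeGroupReplicationClass" {
		labels["example.ramen/replicationID"] = replicationID // placeholder key
	}
	return labels
}
```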

ID = hex.EncodeToString(md5Sum[:16])
} // todo storageID in provider is planned to be string(r.storageClaim.UID)
// SID for RamenDR
SID := req.StorageConsumerUUID
@rewantsoni (Member):

StorageID is used to match the V*RC to the StorageClass; it doesn't need to be the storageClaim UID, it can be the storageRequest UID as well.

@raaizik (Contributor, Author) replied:

We've already agreed that it'll be string(r.storageClaim.UID) in #168. Eventually the value assigned to this ID is the one from the client, so it doesn't actually matter which value is chosen on this end.

@rewantsoni (Member) replied:

As the concept of a storageClaim might go away in the future because of convergence, we should pivot and not use it.

@raaizik (Contributor, Author) replied:

Even so, the final value assigned would be determined on the client.

@rewantsoni (Member) replied:

Why do we need to determine the label value on the client if we can do it from the provider itself?

@raaizik (Contributor, Author) replied Sep 23, 2024:

We could. I'm saying that, with the current state of these PRs, this is what would happen: after this PR is merged, the value assigned here will be propagated to the client, and the changes will be reflected in the mentioned PR.

@rewantsoni (Member) replied:

Yes, I agree that we could, but since storageID is an arbitrary value used by ramen to cross-link the VRC/VSC and SC, the hash of the storageRequest UID fits our use case perfectly. I don't think we need to change it further on the client. Let's wait for others to review this.
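
For reference, a minimal sketch of that computation, matching the hex.EncodeToString(md5Sum[:16]) snippet quoted above but keyed on the storageRequest UID; the function name is assumed:

```go
import (
	"crypto/md5"
	"encoding/hex"
)

// storageIDFor is a hypothetical helper: hash the storageRequest UID and
// hex-encode it, giving ramen a stable ID to cross-link the VRC/VSC and SC.
func storageIDFor(storageRequestUID string) string {
	md5Sum := md5.Sum([]byte(storageRequestUID)) // md5.Sum returns [16]byte
	return hex.EncodeToString(md5Sum[:])
}
```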

@raaizik (Contributor, Author) commented Sep 23, 2024

/retest

Labels: approved (Indicates a PR has been approved by an approver from all required OWNERS files), do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress)