CFE-1131: AWS Tags DAY2 Update #297

anirudhAgniRedhat · 2024-10-11T11:25:26Z

This PR introduces a custom EBSVolumeTagController that monitors the OpenShift Infrastructure resource for changes in AWS ResourceTags. When tags are updated, the controller automatically fetches all AWS EBS-backed PersistentVolumes (PVs) in the cluster, retrieves their volume IDs, and updates the associated EBS tags in AWS.

Key Changes:

Monitors Infrastructure resource for AWS ResourceTags updates.
Directly fetches all PVs using the AWS EBS CSI driver (ebs.csi.aws.com).
Updates AWS EBS tags by merging new and existing tags using the AWS SDK.

openshift-ci-robot · 2024-10-11T11:25:29Z

@anirudhAgniRedhat: This pull request references CFE-1131 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to this:

This PR introduces a custom EBSVolumeTagController that monitors the OpenShift Infrastructure resource for changes in AWS ResourceTags. When tags are updated, the controller automatically fetches all AWS EBS-backed PersistentVolumes (PVs) in the cluster, retrieves their volume IDs, and updates the associated EBS tags in AWS.

Key Changes:

Monitors Infrastructure resource for AWS ResourceTags updates.
Directly fetches all PVs using the AWS EBS CSI driver (ebs.csi.aws.com).
Updates AWS EBS tags by merging new and existing tags using the AWS SDK.
Graceful operator restart on tag changes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

anirudhAgniRedhat · 2024-10-11T11:25:54Z

/hold
Please don't review yet; this is currently WIP.

openshift-ci-robot · 2024-10-14T15:35:16Z

@anirudhAgniRedhat: This pull request references CFE-1131 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

anirudhAgniRedhat · 2024-10-15T18:07:36Z

/unhold
open for reviews

TrilokGeer · 2024-10-16T07:35:47Z

cmd/aws-ebs-csi-driver-operator/main.go

+
+	go ebsTagsController.Run(ctx)
+
+	klog.Info("EBS Volume Tag Controller is running")


nit: TBD - clean up info logs before merge.


anirudhAgniRedhat · 2024-10-21T04:16:29Z

/cc @jsafrane
PTAL!!

jsafrane

Please update cluster-storage-operator to add token-minter sidecar.


jsafrane · 2024-10-21T14:55:31Z

pkg/driver/aws-ebs/aws_ebs_tags_controller.go

+			if err != nil {
+				klog.Errorf("Error updating tags for volume %s: %v", volumeID, err)


There should be a retry with exponential backoff, especially when CreateTags calls are throttled by AWS.
That probably implies a queue of PVs.


gnufied · 2024-10-22T19:42:36Z

pkg/driver/aws-ebs/aws_ebs_tags_controller.go

+	infraInformer := c.commonClient.ConfigInformers.Config().V1().Infrastructures().Informer()
+
+	// Add event handler to process updates only when ResourceTags change
+	infraInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{


We don't have to do this. If we are using factory.WithInformers, we will reconcile this within the Sync function.

Hey @gnufied, I was a bit confused by this. Basically, I don't want to run reconciliation on every change to the Infrastructure resource. I need to run reconciliation only if there is a change in infra.Status.PlatformStatus.AWS.ResourceTags.

Can you suggest a better way to do this so that we can avoid the unnecessary computation?

How will this work when the controller is restarted? You are also doing WithInformers below, and hence any change in the infra object will still trigger Sync.

So the most bulletproof way of ensuring that we don't unnecessarily process all PVs is to persistently store the information that we have already processed these PVs.

So, what we have currently is the worst of both worlds. If I were to design this, I would probably make a hash of the sorted tags and annotate each PV with the tag hash. If the tag hash annotation on the PV matches the computed tag hash, then I would not update the PV; otherwise I would.

How will this work when the controller is restarted? You are also doing WithInformers below, and hence any change in the infra object will still trigger Sync.

What I am thinking is that on every restart I would run the sync function and update the volumes if there is a change, and then on each change in the resource tags we would run the sync again.

So, what we have currently is the worst of both worlds. If I were to design this, I would probably make a hash of the sorted tags and annotate each PV with the tag hash. If the tag hash annotation on the PV matches the computed tag hash, then I would not update the PV; otherwise I would.

This looks like an easy way to manage this! I will then add a new field to the controller struct that holds the latest hash of the sorted map of tags. Then in each reconciliation I will update the tags only if the new hash differs from the stored one. Does this sound better to you?

But that is not what I said. I said we should store the hash of the sorted tags in the PV objects as an annotation and compare it with the hash of the tags we are about to apply. We should only apply tags in AWS if the hashes differ.

Storing them just in-memory doesn't help us much.

Added. Thanks for the suggestion!
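The per-PV tag hash discussed in this thread can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the annotation key `tagHashAnnotation` and the helpers `computeTagHash` and `hashChanged` are hypothetical, and PV annotations are modelled as a plain `map[string]string`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// tagHashAnnotation is a hypothetical annotation key, not taken from the PR.
const tagHashAnnotation = "example.openshift.io/ebs-tags-hash"

// computeTagHash returns a stable hash of the tags by hashing the
// key/value pairs in sorted key order, so map iteration order does not matter.
func computeTagHash(tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// A zero byte separates fields so {"ab": "c"} and {"a": "bc"} differ.
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(tags[k]))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// hashChanged reports whether the hash stored in the PV annotations differs
// from the hash of the tags that are about to be applied.
func hashChanged(pvAnnotations, desiredTags map[string]string) bool {
	return pvAnnotations[tagHashAnnotation] != computeTagHash(desiredTags)
}
```

Because the hash lives on the PV object rather than in controller memory, it survives a controller restart, which is the point made above about in-memory storage not helping much.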

gnufied · 2024-10-23T13:28:01Z

pkg/driver/aws-ebs/aws_ebs_tags_controller.go

+	if err != nil {
+		return err
+	}
+	err = c.processInfrastructure(infra, ec2Client)


So, does this controller need to be an opt-in or a default controller? I don't assume every OCP customer wants this feature, and if tagging were to fail after an OCP upgrade, their clusters would be degraded and we would have a support nightmare.

cc @jsafrane

It is presented as enabled by default + no opt out for all HyperShift clusters in the enhancement.

gnufied · 2024-10-23T13:28:24Z

pkg/driver/aws-ebs/aws_ebs_tags_controller.go

+}
+
+// startFailedQueueWorker runs a worker that processes failed batches independently
+func (c *EBSVolumeTagsController) startFailedQueueWorker(ctx context.Context) {


I am not sure if we need this function at all.

The reason I made this change is that I would like to retry the batches that failed to update tags. Here I would like to update the tags for the volumes serially, one by one (discussion link). I cannot use a similar sync function here, unless you can suggest a better way to do this.

If we are going with a per-PV hash, then we are not going to need this, right?

The points I want to make here are:

I definitely need to batch volumes so that we don't hit throttling.

On failure, we would need to retry.

Since we are using batch APIs, the AWS SDK returns an error if any one VolumeID in the batch hits an error (validation, auth, permissions, etc.), and none of the VolumeIDs in that batch get their tags updated. For this case I would like to add a worker queue that handles the serial update of the PVs' tags and retries with exponential backoff.

We need to figure out the trade-off: either update all PVs serially through the AWS APIs in the first place and retry using a similar sync function, or update volumes in batches in the sync function and process the retries serially through the queue until it is empty.
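The second option above (batch first, then retry members of failed batches individually) can be sketched like this. It is a minimal illustration with hypothetical names (`batchIDs`, `tagInBatches`, `errBatchRejected`); the `tagFn` callback stands in for whatever AWS call tags a batch of volume IDs, and nothing here reflects the actual SDK signatures.

```go
package main

import "errors"

// errBatchRejected is a hypothetical error for a batch the API refused.
var errBatchRejected = errors.New("batch rejected")

// batchIDs splits ids into consecutive slices of at most size elements.
func batchIDs(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

// tagInBatches calls tagFn once per batch. Because one bad volume fails the
// whole batch, every ID from a failed batch is returned so the caller can
// retry those volumes one by one.
func tagInBatches(ids []string, size int, tagFn func(batch []string) error) []string {
	var failed []string
	for _, batch := range batchIDs(ids, size) {
		if err := tagFn(batch); err != nil {
			failed = append(failed, batch...)
		}
	}
	return failed
}
```

The returned IDs are the candidates for a serial retry queue; volumes in successful batches never enter it.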

I definitely need to batch volumes so that we don't hit throttling.

That is fine. You are already batching PVs in fetchPVsAndUpdateTags.

On failure, we would need to retry.

We are unnecessarily complicating this. The controller is going to resync every 20 minutes anyway, so it will try to tag all the PVs that aren't tagged. So do we even need to keep a separate queue for failed PVs? I am also afraid that your failed worker queue is going to race with the normal controller resync.

What is the point of a separate failed worker queue when, every 20 minutes, we are going to try to sync tags for all PVs that don't have a matching tag hash? If you really want a separate worker queue, you will have to redesign the whole thing so that at least the two are not racy. But I don't really understand the point of doing it.

Hey @gnufied, I completely agree that retrying the failures is making the change unnecessarily complicated.

@jsafrane I had a chat with @TrilokGeer regarding this. IMHO we should drop the idea of adding a degraded condition on tag update failure, as the cluster should not be degraded because tags failed to apply. Since the sync will retry tagging the volumes within the resync period anyway, we do not need to retry failures immediately and can remove this queue worker.
Also, we are already emitting warning events from this controller when a tag update fails, so the user will know that the update failed and why, and alerts can be built on those events.

/cc @TrilokGeer
Can you also share your views on this?

BTW are we planning to backport this PR to older releases?

@gnufied I guess we would need to backport this to 4.17. Slack Thread.

So I noticed that you removed the failed worker logic. The thing is, I was talking to @jsafrane offline and he has convinced me that we do need some kind of additional logic so that we can retry tagging of "failed" PVs one by one, rather than in a batch. This will ensure that one bad apple in a batch doesn't prevent tagging of the rest of the PVs.

But we need to be careful when doing this.

We should make sure that PVs which will be tagged via the failed worker don't get processed via the regular controller resync (so no race).

I would move the entire failed worker code into a separate file.
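One way to keep the regular resync away from PVs owned by the failed worker is a mutex-guarded set that the sync consults before batching. The sketch below is an assumption about how that could look, not the PR's implementation; `failedSet` and `filterForSync` are hypothetical names and PVs are represented by their names only.

```go
package main

import "sync"

// failedSet tracks the PV names currently owned by the retry worker.
// The mutex makes it safe to share between the sync loop and the worker.
type failedSet struct {
	mu    sync.Mutex
	names map[string]struct{}
}

func newFailedSet() *failedSet {
	return &failedSet{names: make(map[string]struct{})}
}

// Add marks a PV as owned by the retry worker.
func (s *failedSet) Add(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.names[name] = struct{}{}
}

// Remove releases a PV once the retry worker is done with it.
func (s *failedSet) Remove(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.names, name)
}

// Contains reports whether the retry worker currently owns the PV.
func (s *failedSet) Contains(name string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.names[name]
	return ok
}

// filterForSync returns the PV names the regular sync may process,
// skipping everything the retry worker currently owns.
func filterForSync(pvNames []string, failed *failedSet) []string {
	var eligible []string
	for _, name := range pvNames {
		if !failed.Contains(name) {
			eligible = append(eligible, name)
		}
	}
	return eligible
}
```

The set only prevents both paths from picking up the same PV; it does not by itself guarantee that both paths read the same version of the Infrastructure tags.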

Ack.
I believe we can remove the ResyncEvery parameter from the controller builder, as it is overkill for us: why would we want to run the sync function every 20 minutes if nothing has changed in the resource tags?

Alternatively, if you really think resync is important here, I think we can handle the race condition by using another annotation on the PVs for tagging status and filtering based on the tag hash and the status in the annotations. But this will also cost us some PV update calls, and we would further need to handle cases where we are not able to update the status within batches. WDYT?


anirudhAgniRedhat · 2024-11-07T08:10:03Z

Please update cluster-storage-operator to add token-minter sidecar.

Added a PR to add the token-minter sidecar. PTAL!
openshift/cluster-storage-operator#528

anirudhAgniRedhat · 2024-11-07T11:53:55Z

/retest
/test e2e-openstack-cinder-csi


openshift-ci · 2024-12-14T14:43:38Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: anirudhAgniRedhat
Once this PR has been reviewed and has the lgtm label, please assign mpatlasov for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

anirudhAgniRedhat · 2024-12-16T04:13:20Z

/test e2e-openstack-manila-csi
/test hypershift-e2e-openstack-csi-cinder
/test hypershift-e2e-openstack-csi-manila
/test e2e-azurestack-csi
/test smb-win2022-operator-e2e
/test e2e-azure-file-nfs-csi
/test okd-scos-e2e-aws-ovn

TrilokGeer · 2024-12-16T10:42:46Z

pkg/driver/aws-ebs/aws_ebs_tags_controller.go

+		// Check if the volume is a CSI volume with the correct driver
+		if volume.Spec.CSI != nil && volume.Spec.CSI.Driver == driverName &&
+			// Ensure the volume is not already in the failed queue
+			!c.isVolumeInFailedQueue(volume.Name) &&


On a new update, applying the tags to a volume in the failed queue still depends on its exponential backoff. This means the delay accumulated from previous errors is not reset by the update, and it can reach the maximum backoff time.

Yes @TrilokGeer, here we cannot update the tags immediately if the volume is in the failed queue.
But it would be even worse to refresh the queue by removing all volumes from it and then retrying the tag update in batches, since we call the sync function here whenever there is any update to the Infrastructure resource.

TrilokGeer · 2024-12-16T11:05:59Z

pkg/driver/aws-ebs/aws_ebs_tags_retry_worker.go

+
+	klog.Infof("Retrying failed volume: %v", pvName)
+
+	infra, err := c.getInfrastructure()


Race condition: As I understand from @anirudhAgniRedhat, the worker is a separate thread running concurrently. In that case, the call here does not guarantee that the infrastructure being read is the latest. There is a chance that an older version of the infrastructure tags is read and applied to the PV. As this operation is not repeated until the next update, it will lead to a situation where some of the volumes carry older versions of the tags.

Correct. Maybe we would need to resync every x minutes, in case a volume is out of sync due to reading older data here.

WDYT @gnufied @jsafrane ?

TrilokGeer · 2024-12-16T11:12:30Z

pkg/driver/aws-ebs/aws_ebs_tags_retry_worker.go

+	}
+
+	if c.needsTagUpdate(infra, pv) {
+		c.updateTags(ctx, pv, infra.Status.PlatformStatus.AWS.Region, infra.Status.PlatformStatus.AWS.ResourceTags)


Based on the earlier assumption that the worker queue runs as a concurrent thread, updating tags becomes a critical section shared by sync and the worker. To avoid race conditions, it is best to have a single thread own the update call.
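A common Go way to give one thread sole ownership of the update call is to funnel all requests through a channel drained by a single goroutine. The sketch below illustrates that idea only; `tagRequest` and `startUpdater` are hypothetical, and the `apply` callback stands in for the controller's real update logic.

```go
package main

// tagRequest is a hypothetical unit of work: tag one volume with a tag set.
type tagRequest struct {
	volumeID string
	tags     map[string]string
}

// startUpdater launches the single goroutine that owns all tag updates.
// It returns a submit function that any goroutine may call, and a stop
// function that closes the queue and waits for pending requests to finish.
// stop must only be called after every submit call has returned.
func startUpdater(apply func(tagRequest)) (submit func(tagRequest), stop func()) {
	requests := make(chan tagRequest)
	done := make(chan struct{})

	go func() {
		defer close(done)
		// apply is only ever invoked here, so it never runs concurrently
		// with itself and needs no locking of its own.
		for req := range requests {
			apply(req)
		}
	}()

	submit = func(r tagRequest) { requests <- r }
	stop = func() {
		close(requests)
		<-done
	}
	return submit, stop
}
```

With this shape, both the sync loop and the retry worker would call `submit` instead of updating tags directly, so the update itself stops being a shared critical section.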

TrilokGeer · 2024-12-16T11:30:16Z

pkg/driver/aws-ebs/aws_ebs_tags_retry_worker.go

+	}
+
+	klog.Infof("Successfully updated PV annotations for volume %s", pv.Name)
+	c.failedQueue.Forget(pv.Name)


Maybe we should consider reporting a status to the user about in-progress activity. With no updates, the user cannot know whether the tags have been applied successfully, have failed, or are still in progress.

Maybe a separate set of events might help?

TrilokGeer · 2024-12-16T11:37:50Z

AFAIU, it is good to consider:

Sync loop blocking time on tag updates, to optimize the success cases.
For error cases, tag data consistency between sync and worker updates. Reading the infrastructure object in both threads and updating independently leads to race conditions.
A way to report the status of the current tag update action.
@jsafrane @gnufied @anirudhAgniRedhat

anirudhAgniRedhat · 2024-12-17T17:45:28Z

/retest

openshift-ci · 2024-12-17T20:21:31Z

@anirudhAgniRedhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/hypershift-e2e-openstack-csi-cinder	`869a428`	link	true	`/test hypershift-e2e-openstack-csi-cinder`
ci/prow/hypershift-e2e-openstack-csi-manila	`869a428`	link	true	`/test hypershift-e2e-openstack-csi-manila`
ci/prow/aws-efs-operator-e2e-extended	`20a925b`	link	false	`/test aws-efs-operator-e2e-extended`
ci/prow/smb-win2022-operator-e2e	`20a925b`	link	false	`/test smb-win2022-operator-e2e`
ci/prow/okd-scos-e2e-aws-ovn	`dd3e77c`	link	false	`/test okd-scos-e2e-aws-ovn`
ci/prow/e2e-azurestack-csi	`dd3e77c`	link	false	`/test e2e-azurestack-csi`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 11, 2024

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 11, 2024

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 11, 2024

openshift-ci bot requested review from dobsonj and RomanBednar October 11, 2024 11:28

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from 4d92e01 to 14ecc0e Compare October 14, 2024 12:42

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from d68d71a to 44e34e7 Compare October 14, 2024 16:00

anirudhAgniRedhat changed the title ~~[WIP] CFE-1131: AWS Tags DAY2 Update~~ CFE-1131: AWS Tags DAY2 Update Oct 14, 2024

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 14, 2024

openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 15, 2024

TrilokGeer reviewed Oct 16, 2024


openshift-ci bot requested a review from jsafrane October 21, 2024 04:16

jsafrane reviewed Oct 21, 2024


anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch 2 times, most recently from 146634c to 858b442 Compare October 22, 2024 18:47

gnufied reviewed Oct 22, 2024


anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from 858b442 to fe7b17e Compare October 23, 2024 08:31

gnufied reviewed Oct 23, 2024


gnufied reviewed Oct 24, 2024


anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch 3 times, most recently from a9b8b54 to 42579da Compare October 25, 2024 11:40

gnufied reviewed Oct 26, 2024


anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch 2 times, most recently from e70bfeb to c0e7dd8 Compare November 4, 2024 13:02

anirudhAgniRedhat mentioned this pull request Nov 5, 2024

CFE-1132: EFS Access Point Tags Update DAY2 #313

Open

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from c1ae14d to c62693a Compare November 7, 2024 05:18

anirudhAgniRedhat requested review from jsafrane and gnufied November 7, 2024 07:28

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from c62693a to 869a428 Compare November 7, 2024 08:07

gnufied reviewed Nov 7, 2024


jsafrane mentioned this pull request Nov 19, 2024

CFE-1162: Updates enhancement to reflect hcp usecase and with latest information on aws tags support openshift/enhancements#1700

Open

gnufied reviewed Dec 10, 2024


anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from ea5ac48 to aec3b99 Compare December 14, 2024 17:30

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 14, 2024

Added tag controller for tags reconciliation

b9746ee

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from aec3b99 to 73dcb42 Compare December 14, 2024 17:34

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 14, 2024

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from 73dcb42 to 20a925b Compare December 16, 2024 08:29

TrilokGeer reviewed Dec 16, 2024


anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch 3 times, most recently from 911ea8c to 3d1a4e2 Compare December 16, 2024 13:32

Added Review nits

dd3e77c

anirudhAgniRedhat force-pushed the AWS_DAY2_TAGS_RECONCILIATION branch from 3d1a4e2 to dd3e77c Compare December 17, 2024 13:56


