feat: Add ignore-resources-tracking annotation to ignore resources update #18343
cc @agaudreault |
controller/cache/cache.go
Outdated
@@ -67,6 +68,9 @@ const ( | |||
|
|||
// EnvClusterCacheRetryUseBackoff is the env variable to control whether to use a backoff strategy with the retry during cluster cache sync | |||
EnvClusterCacheRetryUseBackoff = "ARGOCD_CLUSTER_CACHE_RETRY_USE_BACKOFF" | |||
|
|||
// AnnotationIgnoreResourcesTracking is a Kubernetes annotation for a Kubernetes resource to ignore any resources tracking | |||
AnnotationIgnoreResourcesTracking = "argocd.argoproj.io/ignore-resources-tracking" |
I would keep the same terminology. resource tracking means it does not show up in Argo UI and this would lead to confusion
AnnotationIgnoreResourcesTracking = "argocd.argoproj.io/ignore-resources-tracking" | |
AnnotationIgnoreResourcesUpdate = "argocd.argoproj.io/ignore-resources-update" |
controller/cache/cache.go
Outdated
// annotations stores all the ObjectRef annotations | ||
annotations map[string]string |
I think we can save memory here and just have a bool instead. Annotations like kubectl.kubernetes.io/last-applied-configuration can end up consuming a lot if you consider the number of resources argo watches.
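To make the memory suggestion concrete, here is a minimal sketch of keeping only a parsed flag instead of the whole annotation map. The `resourceInfo` type, its `ignoreTracking` field, and `newResourceInfo` are hypothetical stand-ins for illustration, not the controller's actual types; the annotation key is the one from the diff at this point in the review.

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation key as proposed in the diff under review at this point in the thread.
const annotationIgnoreResourcesTracking = "argocd.argoproj.io/ignore-resources-tracking"

// resourceInfo is a hypothetical, trimmed-down stand-in for the controller's ResourceInfo.
type resourceInfo struct {
	// ignoreTracking holds only the parsed flag, so large annotations
	// (for example last-applied-configuration) are never retained.
	ignoreTracking bool
}

// newResourceInfo reads the single annotation of interest and discards the rest.
// A missing or unparsable value is treated as false.
func newResourceInfo(annotations map[string]string) resourceInfo {
	enabled, err := strconv.ParseBool(annotations[annotationIgnoreResourcesTracking])
	if err != nil {
		enabled = false
	}
	return resourceInfo{ignoreTracking: enabled}
}

func main() {
	info := newResourceInfo(map[string]string{
		annotationIgnoreResourcesTracking:                  "true",
		"kubectl.kubernetes.io/last-applied-configuration": "{...large JSON...}",
	})
	fmt.Println(info.ignoreTracking)
}
```

The cost per cached resource is then one bool rather than a copy of every annotation string.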
controller/cache/cache.go
Outdated
@@ -342,6 +349,14 @@ func skipAppRequeuing(key kube.ResourceKey) bool { | |||
} | |||
|
|||
func skipResourceUpdate(oldInfo, newInfo *ResourceInfo) bool { | |||
if val, ok := newInfo.annotations[AnnotationIgnoreResourcesTracking]; ok { |
I think you should ignore only if the health status did not change.
controller/cache/cache.go
Outdated
@@ -549,7 +564,7 @@ func (c *liveStateCache) getCluster(server string) (clustercache.ClusterCache, e | |||
"name": ref.Name, | |||
"api-version": ref.APIVersion, | |||
"kind": ref.Kind, | |||
}).Debug("Ignoring change of object because none of the watched resource fields have changed") | |||
}).Debugf("Ignoring change of object because none of the watched resource fields have changed or annotation %v is set to true", AnnotationIgnoreResourcesTracking) |
I don't think the debug message adds much information. I would either leave it as-is, or perhaps add a log field with the values of the annotation. This way we can query logs to know what was ignored because of annotations.
}).Debugf("Ignoring change of object because none of the watched resource fields have changed or annotation %v is set to true", AnnotationIgnoreResourcesTracking) | |
}).Debug("Ignoring change of object because none of the watched resource fields have changed") |
@kahoulei Something I was thinking about but did not have the time to complete was to do the same, but with an "allow" annotation instead of an "ignore". Basically, the advantage was that you don't have to always ignore the full resources, but you can leverage the |
@agaudreault thanks for the review. If I understand correctly, for the resource that we want to ignore we should set If |
Correct, if the annotation is there and is true, we would generate a hash and compare it later on. If the annotation is false or missing, then we do nothing (like what we have today) |
I thought we will generate hash if see https://github.com/argoproj/argo-cd/blob/master/controller/cache/cache.go#L507-L514 |
@agaudreault PTAL if you get a chance. Thanks! |
controller/cache/info.go
Outdated
if k == AnnotationApplyResourcesUpdate { | ||
value, err := strconv.ParseBool(v) | ||
if err != nil { | ||
value = false | ||
} | ||
res.applyResourcesUpdate = value | ||
} |
The code can be simplified by just checking this at the end of the shouldHashManifest method.
If shouldHashManifest returns true because the annotation is present for an untracked resource, then it will take the same code path as the current behavior for tracked resources.
No need to update the ResourceInfo, store booleans or annotations.
@agaudreault updated with your suggestion.
One additional change in skipResourceUpdate: since we are not going to generate a hash, the logic of isSameManifest needs to change. So PTAL.
We simply cannot generate a hash by default or skip the resources updates for all the resources. When the annotation is not defined, the hash should not be generated. If you generate a hash by default for all resource updates, the controller will use a lot more CPU generating hashes than what you will save skipping reconcile.
I also think we should not go in a direction where non-argocd resources must have argocd annotations.
That is why I think the best approach is:
- Generate the hash only if annotation is present and true.
- (already done) Skip the resource updates if the hashes exist and the hashes match OR if the resource does not belong to any application
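The first bullet can be sketched roughly as follows. This is a simplified illustration, not the actual `shouldHashManifest` implementation: the function name `shouldHash`, its parameters, and the `isApplication` flag (standing in for the group/kind check) are assumptions, and the annotation key is the one in use at this point in the review.

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation key as named at this point in the review thread.
const annotationApplyResourcesUpdate = "argocd.argoproj.io/apply-resources-update"

// shouldHash is a hypothetical, simplified version of the decision described above.
// appName is empty for resources that are not tracked by any application;
// isApplication stands in for the Application group/kind check.
func shouldHash(appName string, isApplication bool, annotations map[string]string) bool {
	// Tracked resources keep the existing behavior: always hash.
	if appName != "" || isApplication {
		return true
	}
	// Untracked resources are hashed only when the annotation is present and true.
	enabled, err := strconv.ParseBool(annotations[annotationApplyResourcesUpdate])
	return err == nil && enabled
}

func main() {
	annotated := map[string]string{annotationApplyResourcesUpdate: "true"}
	fmt.Println(shouldHash("my-app", false, nil)) // tracked resource
	fmt.Println(shouldHash("", false, annotated)) // untracked, opted in
	fmt.Println(shouldHash("", false, nil))       // untracked, no annotation
}
```

With this shape, resources without the annotation never pay the hashing cost, which is the CPU concern raised above.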
…date Signed-off-by: kahoulei <kahou.lei@okta.com>
42fb0bb
to
140c3d8
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #18343 +/- ##
==========================================
- Coverage 50.69% 50.69% -0.01%
==========================================
Files 315 315
Lines 43381 43389 +8
==========================================
+ Hits 21991 21995 +4
- Misses 18882 18888 +6
+ Partials 2508 2506 -2
@agaudreault I have added unit tests and they all pass. But the e2e tests are failing, and it looks like the failures are unrelated. Can you help me take a look? (Sorry, I am not familiar with the e2e tests yet.) |
controller/cache/cache.go
Outdated
@@ -348,20 +352,37 @@ func skipResourceUpdate(oldInfo, newInfo *ResourceInfo) bool { | |||
return false | |||
} | |||
isSameHealthStatus := (oldInfo.Health == nil && newInfo.Health == nil) || oldInfo.Health != nil && newInfo.Health != nil && oldInfo.Health.Status == newInfo.Health.Status | |||
isSameManifest := oldInfo.manifestHash != "" && newInfo.manifestHash != "" && oldInfo.manifestHash == newInfo.manifestHash | |||
isSameManifest := oldInfo.manifestHash == newInfo.manifestHash |
This condition should not change. If a resource does not have a hash, that means the annotation was not there (or not true), therefore we should not ignore updates.
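A small sketch of why the non-empty check matters. The two boolean expressions are taken from the diff above; the `resourceInfo` and `healthStatus` types, the nil guard, and the final `return` are assumptions added so the example runs on its own.

```go
package main

import "fmt"

// healthStatus and resourceInfo are hypothetical, trimmed-down stand-ins
// for the controller's types.
type healthStatus struct{ Status string }

type resourceInfo struct {
	Health       *healthStatus
	manifestHash string // empty when no hash was generated for the resource
}

// skipResourceUpdate reports whether an update can be ignored.
// The nil guard and the combined return value are assumptions for this sketch.
func skipResourceUpdate(oldInfo, newInfo *resourceInfo) bool {
	if oldInfo == nil || newInfo == nil {
		return false
	}
	isSameHealthStatus := (oldInfo.Health == nil && newInfo.Health == nil) ||
		(oldInfo.Health != nil && newInfo.Health != nil && oldInfo.Health.Status == newInfo.Health.Status)
	// Two empty hashes must NOT count as "same": an empty hash means the
	// resource was never hashed, so its updates should not be ignored.
	isSameManifest := oldInfo.manifestHash != "" && newInfo.manifestHash != "" &&
		oldInfo.manifestHash == newInfo.manifestHash
	return isSameHealthStatus && isSameManifest
}

func main() {
	unhashed := &resourceInfo{}
	hashed := &resourceInfo{manifestHash: "abc"}
	fmt.Println(skipResourceUpdate(unhashed, unhashed)) // no hashes: update is not skipped
	fmt.Println(skipResourceUpdate(hashed, hashed))     // matching hashes, same health: skipped
}
```

Dropping the non-empty checks would make every unhashed resource compare as unchanged, which is the regression this comment is pointing at.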
Ok, I misunderstood your intention. So what you want is for the dependent object to set argocd.argoproj.io/apply-resources-update=true and then to use the existing ignore resources config to skip refresh. I will update the PR.
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Logic looks good. Most comments are about terminology and user documentation to polish.
controller/cache/cache.go
Outdated
// AnnotationApplyResourcesUpdate when set to true on a resource that is not tracked under an app, argocd will generate a | ||
// hash and apply `ignoreResourceUpdate` configuration on it. If the annotation is set to false (or not presented) on a resource | ||
// that is not tracked under an app, ignoreResourceUpdates configuration will not be applied. |
Leaking implementation details on the API.
// AnnotationApplyResourcesUpdate when set to true on a resource that is not tracked under an app, argocd will generate a | |
// hash and apply `ignoreResourceUpdate` configuration on it. If the annotation is set to false (or not presented) on a resource | |
// that is not tracked under an app, ignoreResourceUpdates configuration will not be applied. | |
// AnnotationApplyResourcesUpdate when set to true on an untracked resource, | |
// argo will apply `ignoreResourceUpdates` configuration on it. |
controller/cache/cache.go
Outdated
// AnnotationApplyResourcesUpdate when set to true on a resource that is not tracked under an app, argocd will generate a | ||
// hash and apply `ignoreResourceUpdate` configuration on it. If the annotation is set to false (or not presented) on a resource | ||
// that is not tracked under an app, ignoreResourceUpdates configuration will not be applied. | ||
AnnotationApplyResourcesUpdate = "argocd.argoproj.io/apply-resources-update" |
Now that the code is written, I think it will be much easier for users to have the same terminology with the existing feature. wdyt?
AnnotationApplyResourcesUpdate = "argocd.argoproj.io/apply-resources-update" | |
AnnotationIgnoreResourceUpdates = "argocd.argoproj.io/ignore-resource-updates" |
@agaudreault I think AnnotationApplyResourcesUpdate is more logical. AnnotationIgnoreResourceUpdates seems to be the inverse of what we are doing, so we would need to flip our logic if we renamed it.
controller/cache/cache.go
Outdated
// Best - Only hash for resources that are part of an app or their dependencies | ||
// (current) - Only hash for resources that are part of an app + all apps that might be from an ApplicationSet | ||
// Orphan - If orphan is enabled, hash should be made on all resource of that namespace and a config to disable it | ||
// Worst - Hash all resources watched by Argo | ||
return appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind) | ||
isTrackedResources := appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind) |
Only one resource
isTrackedResources := appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind) | |
isTrackedResource := appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind) |
docs/operator-manual/reconcile.md
Outdated
|
||
## Tracking Dependent Resources | ||
|
||
Dependent resources by default are not being tracked. Therefore, we cannot generate any hash of those objects and utilize | ||
the `ignoreResourceUpdates` configuration. | ||
|
||
If you want to track the dependent object and apply the `ignoreResourceUpdates` configuration, you can add | ||
`argocd.argoproj.io/apply-resources-update=true` annotation in the dependent resources manifest: |
I think the concepts of tracked resources and watched resources here are mixed up. Also the user facing documentation is leaking the implementation.
## Tracking Dependent Resources | |
Dependent resources by default are not being tracked. Therefore, we cannot generate any hash of those objects and utilize | |
the `ignoreResourceUpdates` configuration. | |
If you want to track the dependent object and apply the `ignoreResourceUpdates` configuration, you can add | |
`argocd.argoproj.io/apply-resources-update=true` annotation in the dependent resources manifest: | |
## Ignoring updates for untracked resources | |
ArgoCD will only apply `ignoreResourceUpdates` configuration to tracked resources of an application. This means dependent resources, such as a `ReplicaSet` and `Pod` created by a `Deployment`, will not ignore any updates and will trigger a reconcile of the application for any changes. | |
If you want to apply the `ignoreResourceUpdates` configuration to an untracked resource, you can add the | |
`argocd.argoproj.io/ignore-resource-updates=true` annotation in the dependent resources manifest. |
docs/operator-manual/reconcile.md
Outdated
restartPolicy: OnFailure | ||
``` | ||
|
||
Then you can update `argocd-cm` configMap to ignore the dependent resources: |
Then you can update `argocd-cm` configMap to ignore the dependent resources: | |
The resource updates will be ignored based on your `ignoreResourceUpdates` configuration in the `argocd-cm` configMap: |
docs/operator-manual/reconcile.md
Outdated
Note: If you set `argocd.argoproj.io/apply-resources-update: "false"`, no hash will be generated and `ignoreResourceUpdates` | ||
cannot be applied on those resources. |
Implementation details.
Note: If you set `argocd.argoproj.io/apply-resources-update: "false"`, no hash will be generated and `ignoreResourceUpdates` | |
cannot be applied on those resources. |
Signed-off-by: kahoulei <kahou.lei@okta.com>
@agaudreault I updated the doc. Thanks. |
Signed-off-by: kahoulei <kahou.lei@okta.com>
7c0577c
to
c29e0fb
controller/cache/cache.go
Outdated
|
||
// AnnotationIgnoreResourcesUpdate when set to true on an untracked resource, | ||
// argo will apply `ignoreResourceUpdates` configuration on it. | ||
AnnotationIgnoreResourcesUpdate = "argocd.argoproj.io/ignore-resources-update" |
Let's make the name consistent. Make sure to validate all references
AnnotationIgnoreResourcesUpdate = "argocd.argoproj.io/ignore-resources-update" | |
AnnotationIgnoreResourceUpdates = "argocd.argoproj.io/ignore-resource-updates" |
Signed-off-by: kahoulei <kahou.lei@okta.com>
Code LGTM. Waiting for a before/after graph on an instance with real load before merging
Signed-off-by: kahoulei <kahou.lei@okta.com>
Glad to see that this feature is being added. We have a case where we run large k8s clusters (~5k nodes) with tens of thousands of pods. Pod updates result in significant reconciliation activity and we are looking to leverage this feature. A quick question about this change - if let's say updates on |
@ronaknnathani the behavior is same as |
Discussed load testing with @agaudreault. I don't have an environment to test it currently. If someone knows how to load test it, please leave a comment here so that we can merge it. |
@langesven awesome! Thanks for your post! @crenshaw-dev ready to merge. Can you re-trigger the status checks? It seems they failed for an unrelated issue. |
Hello, we are impacted heavily by dependent resource-triggered reconciliations. We are observing thousands of pod related reconcile events - this is a serious blocker for future ArgoCD adoption for us as already at about 200 apps spread across a few dozen clusters the extraneous reconciliation events are causing ArgoCD to be very memory hungry. Argocd controller pods are getting OOMKilled with 12GB of RAM given to each of 6 pods. It'd be great if this can make it into v2.12. 🙏 |
@agaudreault do you think this rephrasing and the slightly clarified annotation key would be better?
…date (argoproj#18343)
* feat: Add ignore-resources-tracking annotation to ignore resources update
Signed-off-by: kahoulei <kahou.lei@okta.com>
* add doc
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update annotation doc
Signed-off-by: kahoulei <kahou.lei@okta.com>
* refactor annotation usage base on comment feedback
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update annotation
Signed-off-by: kahoulei <kahou.lei@okta.com>
* do not store boolean in resourceInfo
Signed-off-by: kahoulei <kahou.lei@okta.com>
* typo
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update logic
Signed-off-by: kahoulei <kahou.lei@okta.com>
* refactor
Signed-off-by: kahoulei <kahou.lei@okta.com>
* add comment
Signed-off-by: kahoulei <kahou.lei@okta.com>
* add tests
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update doc
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update code base on comment feedback
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update annotation doc
Signed-off-by: kahoulei <kahou.lei@okta.com>
* fix goimport
Signed-off-by: kahoulei <kahou.lei@okta.com>
* fix golint
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update comments
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update docs
Signed-off-by: kahoulei <kahou.lei@okta.com>
* update annotation name
Signed-off-by: kahoulei <kahou.lei@okta.com>
* rename annotation
Signed-off-by: kahoulei <kahou.lei@okta.com>
* lint check
Signed-off-by: kahoulei <kahou.lei@okta.com>
---------
Signed-off-by: kahoulei <kahou.lei@okta.com>
Co-authored-by: kahoulei <kahou.lei@okta.com>
Co-authored-by: Ishita Sequeira <46771830+ishitasequeira@users.noreply.github.com>
Signed-off-by: Rhys Williams <rhys.williams@electrum.co.za>
Any chance this feature can make it into v2.12? |
Hello,
resource.customizations.ignoreResourceUpdates.all: |
  jsonPointers:
  - /status
There is no way to ignore |
Yes. Validating this on all resources (and not only on the resources tracked by argo) will simply use too much CPU, as most resources' watch events should not cause a reconcile. If they do, you might want to check if you have orphan resources enabled, or implement a mutating webhook to add this annotation to all resources of a kind in case they are a bad actor. |
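For the webhook route, the core of the mutation is a JSON Patch that adds the annotation. Below is a rough sketch of building that patch only; the admission review plumbing is left out, and `patchOp`, `escapeJSONPointer`, and `annotationPatch` are hypothetical helper names. Note that the `/` inside the annotation key has to be escaped as `~1` in the patch path (RFC 6901).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

const annotationIgnoreResourceUpdates = "argocd.argoproj.io/ignore-resource-updates"

// patchOp is a single JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// escapeJSONPointer escapes "~" and "/" in a reference token, per RFC 6901.
func escapeJSONPointer(s string) string {
	s = strings.ReplaceAll(s, "~", "~0")
	return strings.ReplaceAll(s, "/", "~1")
}

// annotationPatch builds the patch a mutating webhook could return.
// existing is the object's current annotations; nil means the
// metadata.annotations field is absent and must be created first.
func annotationPatch(existing map[string]string) ([]byte, error) {
	var op patchOp
	if existing == nil {
		op = patchOp{
			Op:    "add",
			Path:  "/metadata/annotations",
			Value: map[string]string{annotationIgnoreResourceUpdates: "true"},
		}
	} else {
		op = patchOp{
			Op:    "add",
			Path:  "/metadata/annotations/" + escapeJSONPointer(annotationIgnoreResourceUpdates),
			Value: "true",
		}
	}
	return json.Marshal([]patchOp{op})
}

func main() {
	patch, err := annotationPatch(map[string]string{"example": "value"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

Which kinds the webhook should match is a per-cluster decision; the sketch only covers the patch itself.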
Hi. I have a question about the shouldHashManifest function. The way I understand it, the hashing will be done for resources that are tracked by Argo CD. If the resource is not tracked, Argo CD will use the value of the annotation From the discussion I can see that it was proposed that the annotation |
@lpugoy if the annotation is true, then it will be evaluated against the configured ignoreResourceUpdates, and based on what changed in the update, it may be ignored or not. Whether it is hashed or not is an implementation detail that does not need to be surfaced to end users (but we need to hash resources in order to evaluate the ignore difference). |
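To illustrate why hashing is needed for this evaluation, here is a rough sketch: drop the ignored fields, hash what remains, and compare the hashes of the old and new manifests. This is not Argo CD's actual normalization or hash function; `removePointer` and `hashIgnoring` are hypothetical helpers, SHA-256 is a stand-in, and only plain object-key pointers such as `/status` are handled.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// removePointer deletes the field addressed by a simple JSON pointer such as "/status".
// Only object keys are supported in this sketch; array indices and "~" escapes are not.
func removePointer(obj map[string]interface{}, pointer string) {
	parts := strings.Split(strings.TrimPrefix(pointer, "/"), "/")
	current := obj
	for i, part := range parts {
		if i == len(parts)-1 {
			delete(current, part)
			return
		}
		next, ok := current[part].(map[string]interface{})
		if !ok {
			return // path does not exist; nothing to remove
		}
		current = next
	}
}

// hashIgnoring returns a hex SHA-256 of the manifest after dropping the ignored fields.
func hashIgnoring(manifest string, ignoredPointers []string) (string, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(manifest), &obj); err != nil {
		return "", err
	}
	for _, p := range ignoredPointers {
		removePointer(obj, p)
	}
	// encoding/json writes map keys in sorted order, so the output is stable.
	normalized, err := json.Marshal(obj)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(normalized)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	ignored := []string{"/status"}
	before, _ := hashIgnoring(`{"spec":{"replicas":1},"status":{"ready":0}}`, ignored)
	after, _ := hashIgnoring(`{"spec":{"replicas":1},"status":{"ready":1}}`, ignored)
	fmt.Println(before == after) // only /status changed, so the hashes match
}
```

An update whose hash matches the previous one changed only ignored fields, so it can be skipped; any other change produces a different hash and goes through as usual.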
Thanks @agaudreault. Still don't fully understand but thanks for taking a second look. 👍 |
This is an enhancement request to allow users to ignore any resource update.
Previously we added a feature, ignoreResourceUpdatesEnabled, which allows users to ignore certain parts of the resource manifest. But that feature does not cover the use case of newly created objects. When a dependent object is created, there is no old manifest to compare against, so we always refresh in that case. This wastes a lot of compute cycles if a cluster has many dependent objects (e.g. jobs and pods created by cronjobs).
Instead, this PR introduces an annotation, argocd.argoproj.io/ignore-resources-tracking, so that users can set it on a specific resource and argocd can completely ignore it. This annotation will only work behind the same ignoreResourceUpdatesEnabled flag.
Checklist: