Provide a way to pause syncing an application #4808

Closed
jsoref opened this issue Nov 10, 2020 · 23 comments
Labels
enhancement New feature or request

Comments

@jsoref
Member

jsoref commented Nov 10, 2020

Summary

Provide a way to pause syncing an application

Motivation

I have an application which is broken. I can delete the application in argocd and then argocd will recreate it. Currently that just recreates it in its broken state. I'd like to be able to tell argocd to "pause" its syncs for this application, let me delete the application, and then after I fix the underlying state (possibly by deleting some PVCs), I would then "unpause" it and let argocd recreate the application properly.

Proposal

An option in app details?

@jsoref jsoref added the enhancement New feature or request label Nov 10, 2020
@jannfis
Member

jannfis commented Nov 10, 2020

Isn't this basically achievable by just turning off auto sync?

@jsoref
Member Author

jsoref commented Nov 10, 2020

I think that auto-sync in my case is controlled by the parent application, which means that if I try to turn it off, it gets overridden when the parent syncs, although I could be wrong.

Certainly, if I try to change the branch of a child application then when the parent application syncs it overwrites the branch.

@jsoref
Member Author

jsoref commented Nov 10, 2020

Note:

  1. If the behavior of these two things is different (i.e. that one can safely change one but not the other), then the UI should make that clear.
  2. The UI should really make it clear that fields (e.g. branch on a child application) will be overwritten on the next sync of their parent.

@jessesuen
Member

I agree with @jannfis that this just sounds like auto-sync enabled/disabled.

> I think that auto-sync in my case is controlled by the parent application, which means that if I try to turn it off, it gets overridden when the parent syncs, although I could be wrong.

Wouldn't you just have the same problem with a new pause field?

@jsoref
Member Author

jsoref commented Nov 10, 2020

It would have to not be something which is controlled by the versioned content -- a strict local overlay.

@vikas027

vikas027 commented Jul 9, 2021

> Isn't this basically achievable by just turning off auto sync?

Yes, it is, but it would still be useful to have a maintenance mode that stops all syncing across the cluster.

@renaudguerin
Contributor

> > Isn't this basically achievable by just turning off auto sync?
>
> Yes, it is, but it would still be useful to have a maintenance mode that stops all syncing across the cluster.

Exactly. Very interested in this feature too, as we have ArgoCD managed by itself and we have to go back to the root manifests and commit things to git whenever we need to temporarily disable selfHeal in an Application several levels below.

@agaudreault
Member

agaudreault commented Jan 21, 2022

I think the title of this question is misleading 😅 Seems related to #3039 and #2913 (comment)

Current behavior
I have the same use case. I also tried to configure diffing on the parent application with the code below. However, if the parent application detects that the app is OutOfSync for any other reason, or if a manual sync is done on the parent application, it re-applies what is configured in source control.

Current workaround
The only way to turn off automated sync when the app is managed as code is to open a PR and update the source.

Problem
We need another way, because disabling it in the UI is crucial for urgent operations, such as someone on call in the middle of the night who cannot update source control because of branch protection.

Expected behavior
ArgoCD would need to sync the Application (child) from the source control, but keep the spec.syncPolicy set on the resource if it was set by the UI.

Config on parent app:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: application-sync-parent
spec:
  # ...
  ignoreDifferences:
    - group: argoproj.io
      kind: Application
      jsonPointers:
        - /spec/syncPolicy/automated
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Automated sync enabled by default on child app:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: child-app
spec:
  # ...
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

@leoluz
Collaborator

leoluz commented Jan 26, 2022

@agaudreault-jive wrote:

> I have the same use case. I also tried to configure diffing on the parent application with the code below. However, if the parent application detects that the app is OutOfSync for any other reason, or if a manual sync is done on the parent application, it re-applies what is configured in source control.

This will be addressed by https://blog.argoproj.io/new-sync-and-diff-strategies-in-argocd-44195d3f8b8c
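
For illustration, here is a minimal sketch of how the parent application might look with the sync option described in that post, assuming Argo CD 2.3 or later. The `RespectIgnoreDifferences=true` option is the same one used in the ApplicationSet example further down this thread. The namespace, project, repository URL, path, and destination are placeholders, and this has not been tested against a cluster.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: application-sync-parent
  namespace: argocd                        # assumption: Argo CD runs in "argocd"
spec:
  project: default                         # placeholder
  source:
    repoURL: https://github.com/user/repo  # placeholder
    path: apps                             # placeholder
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  ignoreDifferences:
    - group: argoproj.io
      kind: Application
      jsonPointers:
        - /spec/syncPolicy/automated
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      # Without this option, ignoreDifferences only affects the computed diff;
      # a sync of the parent would still re-apply the ignored field from Git.
      - RespectIgnoreDifferences=true
```

As far as the documentation describes it, the option only takes effect for resources that already exist in the cluster, so a newly created child would still get the policy from source control.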

@leoluz
Collaborator

leoluz commented Jan 26, 2022

@jsoref @vikas027 @renaudguerin

Have you considered using sync windows to disable syncing?
https://argo-cd.readthedocs.io/en/stable/user-guide/sync_windows/

@agaudreault
Member

agaudreault commented Jan 26, 2022

@leoluz I haven't. It seemed too hackish to use sync windows as a process for disabling sync during emergencies.
I will try to add the following configuration when I have the time to test master locally or when 2.3 is released.

# Allow the Application sync behavior to be overwritten in ArgoCD UI or CLI
resource.customizations.ignoreDifferences.argoproj.io_Application: |
      managedFieldsManagers:
        - argocd-server
      jsonPointers:
        - /spec/syncPolicy/automated

This seems like it could work, and it would be applied only to the automated fields.
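
For context, a setting like this lives under `data` in the `argocd-cm` ConfigMap. The following is a sketch only, assuming a default installation in the `argocd` namespace; like the snippet above, it is untested.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd  # assumption: default installation namespace
  labels:
    app.kubernetes.io/part-of: argocd
data:
  # Applies to every Application resource managed by this Argo CD instance
  resource.customizations.ignoreDifferences.argoproj.io_Application: |
    managedFieldsManagers:
      - argocd-server
    jsonPointers:
      - /spec/syncPolicy/automated
```

Because this is system-level configuration, it would affect all parent applications at once, which may or may not be what you want compared to setting `ignoreDifferences` on a single parent.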

@agaudreault
Member

Restarting this discussion because it also needs to be implemented in ApplicationSet, which has now been merged into this repo.

argoproj/applicationset#542

@rishabh625
Contributor

@agaudreault-jive: Do you mean that the pause-syncing feature, or the ignoreDifferences field, is not implemented in ApplicationSet?

The YAML mentioned in #542 works:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: example
spec:
  generators:
    - clusters: {}
  template:
    metadata:
      name: 'example.{{name}}'
    spec:
      destination:
        namespace: argocd-e2e
        name: '{{name}}'
      project: default
      source:
        path: apps/example
        repoURL: https://github.com/user/repo
        targetRevision: HEAD
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - RespectIgnoreDifferences=true
      ignoreDifferences:
        - group: argoproj.io
          kind: Application
          managedFieldsManagers:
            - argocd-server
          jsonPointers:
            - /spec/syncPolicy/automated

@agaudreault
Member

> @agaudreault-jive: Do you mean that the pause-syncing feature, or the ignoreDifferences field, is not implemented in ApplicationSet?

I mean pausing the sync of an application created and managed by an ApplicationSet.

That does not actually work, and it is listed in the current limitations.
Since I last checked the documentation, an issue has been opened: argoproj/applicationset#186

The lack of ignoreDifferences support in ApplicationSet is another limitation, but implementing it would be one way to fix this issue.

@rishabh625
Contributor

OK, got it. Sorry, I overlooked the actual issue and thought it was a YAML deployment issue. It looks like a must-have feature. The link mentions that the limitation arises by design, but I too feel it would be a good feature.

@sondrelg
Contributor

I have a use-case that I don't see mentioned yet.

We have an automated build pipeline that will build, tag, and push docker images to a registry when code is merged into the main branch of our project. We also have an argocd image watcher process monitoring our registry for new image versions. When it finds one, it updates our deployment manifest image tag. This works well, and for our biggest project we'll manually sync changes a few times per day.

Every now and then we'll have an issue flagged in staging, so we know that syncing the same image to production will introduce an issue. Suddenly it can be critical that no one manually syncs prod.

The easiest way to resolve the problem is to push a fix and wait for a new image to build, but that can take upwards of 30 minutes.

Today we rely on communicating issues like these over Slack, but that's not a great solution. It would be much more useful to be able to temporarily disable or lock syncing for a project until the issue is resolved.

@jannfis
Member

jannfis commented Mar 10, 2023

@sondrelg I think that could be easily accomplished by configuring a blocking sync window in these cases.

@sondrelg
Contributor

sondrelg commented Mar 10, 2023

Am I correct in my interpretation here?

It looks like using windows would mean we first have to guess how long it will take us to resolve the issue. We might say "let's freeze syncs for 30 minutes" and create a window with something like:

  - kind: deny
    schedule: '37 13 * * *'  # outage started 13:37
    duration: 30m  # we think it will be resolved in 30 minutes
    applications:
    - '<our app>'

But then if our fix is delayed, we would need to "top up" the window by extending or recreating it, right? And if we do resolve it within the 30 minutes and someone just lets the window expire without cleaning it up, then tomorrow at 13:37 syncs will be blocked again for 30 minutes?

Is that right? 🙇

@joebowbeer
Contributor

joebowbeer commented Mar 10, 2023

I assume you can assign a deny-all window and then remove it later.

Or you can assign multiple allow windows for the same effect? 😄

@sondrelg
Contributor

sondrelg commented Mar 10, 2023

True, maybe a window with schedule 0 0 * * * and a duration of 24 hours would work.
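
A rough sketch of what such an always-on deny window could look like on the AppProject, assuming the app lives in a project named `default` in the `argocd` namespace (both are assumptions). Only the sync window fields are shown.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default      # assumption: the project the app belongs to
  namespace: argocd  # assumption: default installation namespace
spec:
  # ...
  syncWindows:
    - kind: deny
      schedule: '0 0 * * *'  # starts every day at midnight
      duration: 24h          # lasts until the next start, so it never lapses
      applications:
        - '<our app>'
      manualSync: false      # keep manual syncs blocked as well
```

Removing the entry again would lift the freeze, so there is no need to guess how long the fix will take.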

@mmerrill3
Contributor

mmerrill3 commented Feb 17, 2024

There are two real-world situations I have seen where having a general "stopped" status, and the ability to easily manage that status, would be helpful.

  1. Power management state in Azure AKS. When an AKS cluster has a stopped power state, the API endpoint is not available. All applications in argoCD that are deployed on that target cluster will now show Unknown status. For observability and metrics, it would be great to just have the ability to switch a toggle on the cluster that it is stopped, and those applications on the cluster would be effectively paused.
  2. EKS clusters that have no compute instances. This can arise given that EKS doesn't have the same "power management" ability as AKS in Azure. Our users sometimes remove all compute from an EKS cluster to do a poor-man's version of stopping a cluster. In argoCD, the applications on the target cluster will show as OutOfSync, and depending upon configured intervals, this can be a big CPU issue. It would be nice to have the same ability as the first point, where if the application is on a "stopped" cluster, syncing would be skipped.

Using windows is an option too. Maybe it would be helpful to expose the ability to toggle a cluster's "power state", and the underlying implementation in argoCD would be to create a deny window for the cluster on the cluster's app project?
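
As a sketch of what that underlying deny window might look like when scoped to a cluster rather than an application: sync windows accept a `clusters` list, so something along these lines could be generated by such a toggle. The cluster name is hypothetical, and the toggle itself does not exist today.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default      # assumption: the project the cluster's apps belong to
  namespace: argocd  # assumption: default installation namespace
spec:
  # ...
  syncWindows:
    - kind: deny
      schedule: '0 0 * * *'
      duration: 24h
      clusters:
        - stopped-aks-cluster  # hypothetical cluster name
```

As far as I know, a window only blocks sync operations. It would not by itself change the reported Unknown or OutOfSync status, or stop the controller from trying to refresh the applications.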

@agaudreault
Member

agaudreault commented Feb 20, 2024

I think this issue has been resolved in #14743
See https://argo-cd.readthedocs.io/en/stable/operator-manual/applicationset/Controlling-Resource-Modification/#allow-temporarily-toggling-auto-sync for documentation when using an ApplicationSet. Please reopen if it is not the case.

If using the app-of-apps pattern, the basic application-level configuration described at https://argo-cd.readthedocs.io/en/stable/user-guide/diffing/#application-level-configuration can be used.
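
For the ApplicationSet case, the linked documentation section describes an `ignoreApplicationDifferences` field on the ApplicationSet spec. A minimal sketch, assuming a recent Argo CD version that includes it; the name is a placeholder and the generators and template are elided:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: example  # placeholder
spec:
  # ... generators and template as usual
  ignoreApplicationDifferences:
    # Leave a manually changed sync policy alone on the generated Applications
    - jsonPointers:
        - /spec/syncPolicy
```

With this in place, turning auto-sync off in the UI or CLI on a generated Application should stick until someone turns it back on.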

@sondrelg & @mmerrill3, you are pointing out another use-case that differs from the initial issue. Can you check if a specific issue for this use-case of suspending a cluster already exists, and if not, create a new issue?

Thanks

@sondrelg
Contributor

My use-case was solved above, so no worries 👍 Thanks for the help!
