Introducing break the glass as a principle #38

Closed
wants to merge 4 commits into from
6 changes: 6 additions & 0 deletions GLOSSARY.md
@@ -40,7 +40,13 @@ This glossary accompanies the [GitOps Principles](./PRINCIPLES.md), and other su
Git, from which GitOps derives its name, is the canonical example used as this state store but any other system that meets these criteria may be used.
In all cases, these state stores must be properly configured and precautions must be taken to comply with requirements set out in the GitOps Principles.

- ## Intermediate State Store
A system for storing a copy of the declarations that are mastered in the State Store. This system's purpose is intended to bridge the gap in availability between that of the State Store and the expected availability to make configuration changes to the Software System. The Intermediate State Store will offer an availability the same as or near enough to that of the users' expectations to update configuration in the Software System.
Where an Intermediate State Store is used, Reconciliation is used between the State Store and the Intermediate State Store and then again between the Intermediate State Store and the Software System.

- ## Feedback

Open GitOps follows [control-theory](https://en.wikipedia.org/wiki/Control_theory) and operates in a closed-loop. In control theory, feedback represents how previous attempts to apply a desired state have affected the actual state. For example if the desired state requires more resources than exist in a system, the software agent may make attempts to add resources, to automatically rollback to a previous version, or to send alerts to human operators.

- ## Break the Glass
The process of editing the Intermediate State Store directly in the event that a configuration update needs to be made to the Software System but the State Store is unavailable.
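The two glossary entries above describe a two-hop reconciliation (State Store to Intermediate State Store, then Intermediate State Store to Software System) and a direct edit of the intermediate copy during an outage. A minimal Python sketch of that flow follows. The `Store` class and `reconcile` function are hypothetical names invented for illustration; they are not part of the PR or of any GitOps tool, and each store is reduced to an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy holder for declarations (hypothetical model, not a real API)."""
    declarations: dict = field(default_factory=dict)
    available: bool = True


def reconcile(source: Store, target: Store) -> bool:
    """Copy declarations from source to target; do nothing if source is down."""
    if not source.available:
        return False
    target.declarations = dict(source.declarations)
    return True


state_store = Store({"replicas": 2})
intermediate = Store()
system = Store()

# Normal operation: two reconciliation hops.
reconcile(state_store, intermediate)
reconcile(intermediate, system)

# State Store outage: "break the glass" by editing the intermediate copy.
state_store.available = False
intermediate.declarations["replicas"] = 5
first_hop_ok = reconcile(state_store, intermediate)  # skipped, source is down
second_hop_ok = reconcile(intermediate, system)      # system still gets the edit
```

In this sketch the Software System ends up with the break-glass value while the State Store still holds the old one, which is exactly the divergence that has to be reconciled once the State Store is reachable again.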
3 changes: 3 additions & 0 deletions PRINCIPLES.md
@@ -21,3 +21,6 @@ The [desired state](./GLOSSARY.md#desired-state) of a GitOps managed system must

Software agents [continuously](./GLOSSARY.md#continuous) observe actual system state and [attempt to apply](./GLOSSARY.md#reconciliation) the desired state.

5. **Manageable "always"**

Desired state is able to be updated according to users' SLA expectations to update system state, even if the "source" is unavailable.
Comment on lines +24 to +26
Contributor

@lloydchang lloydchang Oct 13, 2021

@grmhay Thank you for the pull request.

Sorry if I'm misunderstanding you but the PRINCIPLES.md section of this pull request...

> Manageable "always"
>
> Desired state is able to be updated according to users' SLA expectations to update system state, even if the "source" is unavailable.

... seems to assume that the source should be centrally managed or always managed with SLA expectations.

In the scenario that you described, it seems that GitHub is used for GitOps:

> We (Morgan Stanley) believe that the situation where the source of truth for desired state (e.g. github.com or a git-equivalent that an enterprise may run) is less available than your users' expected SLA for making configuration changes is being left by the community as an issue for the implementer to overcome.
> Put succinctly, if Github is unavailable and you want to make changes to your System State, there should be one approach and a set of tooling to allow reconciliation after the fact.
> This will both harm adoption of gitops and is inefficient as I believe we shared a common challenge that we can solve once within the project.
> The first step, as this project has so well established, is a glossary of terms to allow us to describe the problem and a draft principle to add. I have included these in this PR.

My concern is that the proposed principle, as written, seems to presuppose GitOps only running as a centralized system and always managed with an SLA.

While GitHub can be centrally managed with an SLA, Git isn't centrally managed at all.

The proposed principle, as written, seems to exclude non-centralized usages of GitOps, Git, Kubernetes, etc.

While GitOps doesn't require Git, I am listing Git below because you referenced Git earlier...

• Git, by design, is a distributed revision control system (DVCS), and not managed as a centralized system

Since we are discussing principles, which need to be applicable in many scenarios, note that centralized management wouldn't work in disconnected scenarios, such as:

• Kubernetes on fighter jets, e.g. https://www.cncf.io/blog/2021/09/30/how-to-get-robust-gitops-the-u-s-department-of-defense-uses-flux-and-helm/

• Kubernetes in in-store point-of-sale systems, e.g. https://www.cncf.io/blog/2021/02/19/how-a-4-billion-retailer-built-an-enterprise-ready-kubernetes-platform-powered-by-linkerd/

• Kubernetes in air-gapped environments, e.g. https://github.com/cncf/cnf-testsuite/blob/main/AIRGAP.md

• Kubernetes at the edge, e.g. https://www.cncf.io/blog/2021/05/04/kubernetes-at-the-edge-organizations-are-using-edge-technologies-but-there-is-room-to-grow/

While GitOps doesn't require Kubernetes, I listed Kubernetes in links above because Kubernetes is a CNCF project.

Author

Hi @lloydchang. I appreciate your feedback, and apologies for the delay in replying: KubeCon, then a couple of days off. I work in a large enterprise without disconnected scenarios, so it is great to collaborate with someone who has a different perspective! Consider principle #3, "Software agents automatically pull the desired state declarations from the source." If the desired state in the "source" on the "state store" (terms used, I believe, per the Glossary) is less available than the SLA users expect for changing the desired state of the "Software System", we have a problem.

I put this question to end user organization presenters at GitOpsCon. Judging by most of their answers, the problem is either ignored ("well, if Git/GitLab/... is down, we can't make cluster changes") or unsolved, and I believe that will leave GitOps in a bad place.

With your example of the disconnected scenario, doesn't the problem I outline for the enterprise become even more acute? What happens if you need to update the desired state of a Kubernetes cluster (an example software system) but the "state store" is unavailable (e.g. the WAN connection to a branch office holding the cluster is down)? Either you simply can't change the cluster config, or you break glass and change it, and are then left to manually reconcile the desired state on the "state store" with what is on your cluster.
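To make the "reconcile after the fact" step concrete, here is a small Python sketch of detecting what a break-glass change left out of sync and back-porting it. It assumes both the state store and the cluster config can be flattened to dictionaries; `drift`, `backport`, and the `MISSING` sentinel are hypothetical names for illustration only, not functions of Git, Kubernetes, or any GitOps tool.

```python
MISSING = object()  # sentinel marking a key that is absent on one side


def drift(state_store: dict, cluster: dict) -> dict:
    """Map each differing key to (state_store_value, cluster_value)."""
    keys = state_store.keys() | cluster.keys()
    return {
        key: (state_store.get(key, MISSING), cluster.get(key, MISSING))
        for key in sorted(keys)
        if state_store.get(key, MISSING) != cluster.get(key, MISSING)
    }


def backport(state_store: dict, changes: dict) -> None:
    """Apply the cluster side of each drifted key to the state store."""
    for key, (_, live_value) in changes.items():
        if live_value is MISSING:
            state_store.pop(key, None)
        else:
            state_store[key] = live_value


# Declarations as last recorded in the state store before the outage.
state_store = {"replicas": 2, "image": "app:1.0"}
# Cluster config after a break-glass change made during the outage.
cluster = {"replicas": 5, "image": "app:1.0"}

changes = drift(state_store, cluster)
backport(state_store, changes)
```

Here `changes` contains only the `replicas` entry, and after `backport` the state store matches the cluster. In practice the back-port would more plausibly be a reviewed commit than a blind overwrite; the sketch only shows the comparison that a shared approach would need to automate.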

Note: I also have to fix my commits to have DCO sign-off, so I'll amend my commit based on your feedback. Please continue the conversation on my new PR.