Document best practices on working with multiple environments #1071
Hi, based on the discussion on the Weave Slack #flux channel (with Slack users marlin, mbridgen, hidde), here's a summary of what we thought about documenting the best practices for managing multiple environments with Flux and Git. The topics to cover:
- Architecture
- Needs
- Technical answers
- Format

As you can see, there are many questions to answer, and they are not all specific to environment management. I think it is important to first ask the main questions around the Git repository and the avoidance of code duplication (without Helm or Ksonnet). The format can be:
I think the Git repo is the best way to start, as we can quickly compare the theory with reality. Everybody can view and elaborate on the tutorial. Once the repo is stable, documentation and a blog post can follow.
I have created a repo for housing the example: https://github.com/weaveworks/multienv-example
I think the answer to this question will be different for Helm, and it will also change with the next release of Helm, which makes it quite hard to handle. Scoping namespaces is an important part of the picture; there is a long discussion about that elsewhere.
One of the challenges we are having in the multi-environment setup is the duplication of the same definitions for multiple environments in different branches, when the only real thing that changes is that master/qa refer to different image tag patterns for automated deploys.
GitFlow example by Hidde Beydals on Slack: We have a CI pipeline that first validates the K8s manifests using kubeval (with the right specs configured for your K8s version) and then runs our custom validations (e.g. no 'latest' images, correct labels set, etc.). If the pipeline fails, one is not able to merge. We also require reviews from our ops team before one is able to merge.

Q: Do most people put their Helm charts inside the pertinent repository, or in an external repository with all the charts that represent the entire system of microservices? Any thoughts on this?
A: We keep the actual copy of our running chart in the same repository/branch as the cluster it is running on.

Q: Using git submodules, or just replicating each change manually?
A: `helm fetch --untar --untardir charts/`. The update frequency of the charts themselves isn't that high (or you don't always need the updates).

We separate our clusters by environment, not by product. We have multiple products living in the same cluster environment, and the separation happens at the branch level. So we have a repository … Our staging environments, for example, get almost no traffic; they're only there to verify configurations, test things, etc. We let changes flow upwards.
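As an illustration of the validation pipeline described above, here is a hedged sketch of such a CI job (shown in GitLab-CI style; the file layout, stage name, and policy check are hypothetical):

```yaml
# .gitlab-ci.yml (illustrative) - block merges on invalid manifests
validate-manifests:
  stage: test
  script:
    # schema validation against the cluster's Kubernetes version
    - kubeval --kubernetes-version 1.14.0 manifests/*.yaml
    # example custom policy check: forbid ':latest' images
    - "! grep -R 'image:.*:latest' manifests/"
```

Combined with required reviews, a failing job prevents the merge, so only validated manifests ever reach the branch Flux syncs from.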
Just for the record: I am the author of the Slack messages posted above, and any questions regarding this approach are welcome.
Would love to get updates on this, mostly with regard to understanding how other people are doing it and what works well and what doesn't. Maybe the flux team can give info on how they use it? :)

Our current setup
We currently (pre-GitOps) have a simple structure of … Our repos typically manage multiple environments (…). Interestingly enough, tools like ksonnet/kustomize both lend themselves to this structure. Kustomize specifically has … We currently use Ansible to deploy, so we get to choose which inventory to use, and tie that to …

What about GitOps
At the moment it's just creating a lot of questions for us, which are mostly related to how we manage different environments.
Not really looking for any answers here, just getting my thoughts down :)
I noticed this the other day: https://kubectl.docs.kubernetes.io/pages/app_composition_and_deployment/structure_introduction.html It's not really specific to kustomize; it explains various use cases and directory and/or branch layouts that you may choose. Might be useful for anyone with questions in this area.
This is the kind of discussion I have been searching for for a few days. I have a successful flux-based CI/CD setup and would like to expand it to multiple environments by "artifact promotion", viz. develop -> staging -> pre-production -> production. While using Helm in the workflow is somewhat nice, I want to achieve this without Helm. At the moment, our CI deploys the "master" image to a registry (with tag …).
@tckb could you describe the production repo a little more? Is it a completely separate repo, or a branch? My current flow is:

Staging
…

Production
…

I'm not yet happy with the production flow, as I don't really want to be creating git tags that may not pass tests; hence maybe a production branch (and a PR) would be better (which might be what you're doing).
@nabadger the staging workflow looks more or less the same. In the production workflow, I imagine just creating a PR with an approved build from staging, and the production flux takes care of deploying it. I don't have a semver scheme, and production deployments are not automated. One has to open a PR with the latest build; this is the part I am working on at the moment. PS: I am in the process of drafting a workflow with multiple environments with flux, and the workloads are not for production.
@nabadger just finished it and have it working smoothly with GitOps. 😄
I am looking into implementing a multi-environment flux deployment flow. One thing I don't understand is how to trigger integration tests in a CI system when they must run AFTER the deployment of an app has occurred. Since the CI system is not actually aware of the deployment happening (flux follows a 'pull' model), it is difficult to know when the integration tests should start. Here is an example flow:
@jwenz723 FYI: I'm not a part of the project, just a user, but I have integrated with Fluxcloud, and there is a …
How does cleanup happen for step 5? I have a dev branch that was deployed, approved, and now merged to staging; what happens to that environment?
I have been looking into this too. My solution is to dedicate a couple of CI agents to integration-test deploys. These CI agents do have access to the cluster, so you introduce an extra attack vector, but you get the benefit of being able to directly manage the lifecycle of these temporary deployments. Flux only garbage-collects resources under its control, so if you manage your integration deploys outside of flux, it won't interfere.
Is my understanding correct that in a multi-environment setup flux requires a dedicated long-lived git branch per environment that it syncs with (e.g. staging, production)? Thus, for example, a deployment to production happens with a PR merge into the corresponding branch.
I think you have more options than just a branching strategy; here are a few options that should work:
There's a fair amount of flexibility. In my case I'm unlikely to use branches to control deployment to particular environments, because the company I'm setting this up for prefers trunk-based development.
@grahamegee Thank you for sharing your ideas. It really helps and I truly appreciate it. We'd also prefer trunk-based development without long-lived branches, but at this point I am just not clear how to implement deployments to production once a commit in …
@demisx. In my initial response I misread some of what you said, sorry! So in this edit I'm removing all the fluff about how I think flux works. I think you're right that you can't trigger a deploy directly from a git tag. In order to get flux to deploy from a SemVer release tag, you would need a script/pipeline/developer to build and push a Docker image tagged with that SemVer after the commit has been tagged. You would also need a "production" manifest in your config repo with a flux annotation that matches on SemVers. This "production" manifest will get updated by flux when the Docker image is pushed. As you are using a monorepo (I assume your manifest files are also in there), you probably want to structure it so that all the manifest files are contained in a config subdirectory and flux is configured to only monitor that subdirectory.
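For illustration, a sketch of what such a "production" manifest could look like with Flux v1's automation annotations. The image name, container name, and semver range are made up, and the annotation prefix varies by Flux release (older versions used `flux.weave.works/...` instead of `fluxcd.io/...`):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                            # hypothetical app name
  annotations:
    fluxcd.io/automated: "true"          # let Flux release new images
    fluxcd.io/tag.app: semver:~1.0       # only release tags matching 1.0.x
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: app                      # must match the tag.app annotation key
          image: registry.example.com/myapp:1.0.0
```

With this in place, pushing an image tagged e.g. `1.0.1` would cause Flux to update the manifest and commit the change back to the config repo.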
@grahamegee I think you are spot on. This is also my understanding of how it works. I am going to try the separate-subdirectories route. Once again, thank you very much for sharing your thoughts. It really helps.
Yup, I think that's a good plan! I'm likely to try the same thing.
This is what I'm looking at now, but it would be great if there was an option for the automation to create a PR instead of just pushing to the "master" branch, so that ops could approve the PR for deployment to production.
For what it is worth, my team has the requirement to deploy applications to multiple environments: sandbox, dev, test, staging, production (in that order). The number of required environments is due to the fact that we interact with legacy applications following legacy deployment strategies. We are ok with our code being deployed to sandbox, dev, and test clusters at the same time, so we treat all 3 of these environments as equivalent. Here is the strategy we use to accomplish our deployments:
Having a kustomize overlay for each environment provides us with the flexibility to specify configuration parameters specific to each environment. Having two branches (staging and master) provides us with the ability to test code that is placed into …
As requested by @2opremio on Slack:
These are not mutually exclusive if one uses e.g. kustomize. The setup is:
In each environment, flux points to the corresponding git branch, and `kustomize build` uses the common base and the environment-specific overlay directory.

Git branches allow the common base to temporarily diverge. For example, if you introduce a new microservice in dev and add its manifests in base, clusters that track other git branches are not affected. Promotion is a simple git merge: if you now merge the modified dev into test, the new microservice is promoted there as expected. Changes made by the flux daemon (annotations, automatic image releases) are carried over in the merge as well.

Changes specific to a single environment are made in the per-environment overlay directory. This ensures clean git merges and isolates changes to different environments. The setup is moderately complicated and takes some getting used to, but it works well, at least in my experience.
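A sketch of what such a repository might look like; all directory and file names here are illustrative:

```
repo/                     # same tree on every branch (dev, test, prod, ...)
├── base/
│   ├── kustomization.yaml
│   └── myapp/
│       ├── deployment.yaml
│       └── service.yaml
└── overlays/
    ├── dev/
    │   └── kustomization.yaml
    ├── test/
    │   └── kustomization.yaml
    └── prod/
        └── kustomization.yaml
```

And a minimal overlay, assuming a kustomize version where `resources` accepts directories (older releases use `bases:` instead):

```yaml
# overlays/dev/kustomization.yaml (illustrative)
resources:
  - ../../base
patchesStrategicMerge:
  - replica-count.yaml   # dev-only tweaks stay in the overlay
```

Each cluster then runs `kustomize build overlays/<env>` against its own branch, so changes to `base/` on one branch cannot leak into other environments until they are merged upward.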
@datacticapertti
@Perdjesk thanks, that is interesting work. It is not immediately obvious to me how it would work in my case, so let me elaborate on why I ended up with my workflow.

With kustomize you can put most of the configuration in a shared base, and the overlays only contain environment-specific deltas. This is great. The only problem is that if all the environments track the same git branch, any modifications you make to base affect all the environments. Case in point: if you want to have a new microservice in a dev environment, you would ideally add its manifests in base and modify base/kustomization.yaml to include them. But if you do this, the new microservice will appear in all the other environments as well. You can add the new microservice in the dev overlay, but then you need to copy the files and modify kustomization.yaml when you promote to the next environment up. Things can only be moved to base once they have been promoted to all the environments, and then you need to refactor all the overlays at once.

With the branch-based approach this is not a problem, as you can modify base in the dev branch without affecting the other environments. Promotion is done with git merge, and modifications to base are carried over to the next environment.
@datacticapertti what do you do about changes to the staging and production overlays, for example an endpoint you're adding to an overlay? Do you commit it to the dev branch, where it does nothing, and then open a merge request/pull request to staging? Similarly, for production do you commit to dev -> stage -> master? And for "dev"-only cases, if you had certain things that run outside of dev (like qa-only things), do those get committed to dev and merged to qa? I can definitely see where the branching can help and hinder at the same time.
@cdenneen unfortunately funding was pulled before we got to production, so I only have experience with two branches (dev and test). But in general, if you have something that you only want in one environment, you can put it in the overlay only and have nothing in base. As you always merge from a lower environment to a higher one, you end up with cumulatively more overlays: in dev there is only the dev overlay, in test there are overlays for dev and test, etc. I suppose one could prune them, for example only keep the overlay for the immediately preceding environment and git rm the others.

Overlays do need some coordination when doing a promotion with git merge. For example, a deployment manifest in base might want to mount an environment-specific configmap created by an overlay. Git merge does not help here; you need to manually ensure that the configmap is indeed created. In practice the coordination is not too bad. You can use git diff to examine what changes the merge makes, and kdiff3 or similar to compare the overlays to look for things you may need to change manually.
@datacticapertti right, I was just saying if your workflow is something like: …
@datacticapertti I think this is what we're going to do, and I think one small change in the fluxd args could solve a lot of the problems this approach brings up. If, for example, the develop cluster's flux has … Thoughts?
If I followed along correctly, it appears the current way to run tests after a sync (integration/smoke/etc.) in an automated fashion would be to use FluxCloud (https://github.com/justinbarrick/fluxcloud). Have folks been successful with this? There are other proposed options, such as the post-synchronisation hook (#2696), which isn't implemented yet, and it was suggested to use Flagger. Flagger looks to be designed for use in the production environment; is anyone using it in the staging environment to trigger tests before promotion to prod?
FluxCloud won't (imo) be suitable for this, since it only knows what flux is doing, not the state of the application in the cluster. What you really need is something running in-cluster that monitors applications: you need to know that the app has rolled out before testing it. The chatops notification tools in this area are fairly similar, i.e. they often tell you the deployment status, ready pods, unavailable pods, etc. The problem I've seen with such tools is how opinionated they are (i.e. monitoring at the statefulset/deployment level vs. the pod level, and differing code for how you determine a ready pod). Something like Flagger would be ideal. It gets quite difficult to chain together a set of tools to get this feature; it would be much nicer if there were just a couple of solutions to achieve it (flux + flagger).
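As a rough sketch of the "know it has rolled out before testing" step, assuming a plain kubectl-based CI job; the deployment name and the smoke-test command are hypothetical placeholders:

```shell
#!/bin/sh
# Wait for the Deployment that Flux just synced to finish rolling out,
# then run the smoke tests. The gate fails if either step fails.

wait_for_rollout() {
  # --timeout makes the wait fail instead of hanging forever
  kubectl rollout status "deployment/$1" --timeout=120s
}

run_smoke_tests() {
  # placeholder for the real integration/smoke test command
  echo "running smoke tests against $1"
}

deploy_gate() {
  wait_for_rollout "$1" && run_smoke_tests "$1"
}
```

This still leaves the triggering problem (the CI system has to learn that a sync happened, e.g. via a webhook or polling), but it avoids testing a version that has not actually rolled out yet.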
Do you have the env-specific overlay only in the env-specific branch, or will every env branch have all the env overlays?
Hi All,
Here are my thoughts off the top of my head:
Yes, you'd need to run 3 different instances of Flux, each mapped to the corresponding environment branch. Alternatively, you can place manifests into 3 different environment folders (dev/qa/prod) and use one instance of flux per environment to sync its own folder. We use the latter approach for simplicity.
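For concreteness, a hedged sketch of the Flux v1 daemon flags involved in the two approaches (the repo URL and paths are made up): the per-branch model points each cluster's Flux at its own branch via `--git-branch`, while the per-folder model keeps one branch and scopes each Flux with `--git-path`.

```yaml
# Excerpt of a flux Deployment's container args (illustrative values)
args:
  - --git-url=git@github.com:example/k8s-config
  - --git-branch=master
  - --git-path=environments/qa   # per-folder model; for the per-branch
                                 # model, drop this and set --git-branch=qa
```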
I am not sure what you need these for, but if each environment uses a different SSH key, then I'd think you'd need to add them all. Sorry, it's hard to recommend anything here without knowing more about your environment and where exactly you use your public SSH keys.
A single instance of flux can be mapped to only one git branch.
Question: can multiple clusters point to the same config repo (and same branch) to deploy the same workloads identically? |
Thank you @demisx,
When you run the command `fluxctl identity --k8s-fwd-ns flux`, it outputs the SSH public key. This key needs to be added on GitHub so that flux can communicate with the repo. All I wanted to know is: do I need to add 3 SSH keys so that the flux pods in my 3 environments can communicate with my dev/qa/prod branches?
Yes. Each installation of flux will generate a key. You need to add all the SSH keys (as deploy keys with write access) for each corresponding flux instance to be able to access the git repo. Flux will use the configured branch. (I wish there was a way to reply directly to a comment, so everyone doesn't get a notification/email for replies.)
@cloudengineers you can also add an SSH key as a secret and configure flux to use that. That allows you to use a single key if you want to: https://docs.fluxcd.io/en/latest/guides/provide-own-ssh-key/
I don’t see why not.
Very interesting discussion. |
From memory, one of the patterns I was advised to use was a Flux operator per application, as the problem I was trying to solve was how to support Flux and GitOps in an application world where each application has its own git repo, versus an infra repo for a specific environment/cluster/role. I am yet to adopt this; however, given the lightweight resource footprint of Flux, it makes sense and keeps things sensible by doing one thing well. I also like the 1:1 mapping of Flux <-> app repo when it comes to changes and CI/CD. HTH.
+1 |
Has anyone figured out a solution for flux to update a protected branch?
The only way I am aware of is to grant the flux …
Here is an example of how to structure a GitOps repository for multi-env deployments with Flux v2: https://github.com/fluxcd/flux2-kustomize-helm-example
Is there a way to have a folder that deploys to all clusters and is shared, like in old flux?
@shaneramey the https://github.com/fluxcd/flux2-kustomize-helm-example repo shows exactly that: the infrastructure dir is shared across all clusters.
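For reference, a hedged sketch of the Flux v2 object that wires a cluster to such a shared directory; the names and interval are illustrative, and the API version may differ between Flux 2 releases:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  path: ./infrastructure     # the directory shared by all clusters
  prune: true                # garbage-collect removed manifests
  sourceRef:
    kind: GitRepository
    name: flux-system
```

Each cluster applies its own copy of this object, so they all reconcile the same `./infrastructure` path from the same Git source.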
We've published two examples on how to structure repositories for Flux v2:
Closing this issue as Flux v1 is in maintenance mode.
We should create a topic (and maybe a blog post) on the best practices for using Flux in multiple environments: test, staging, and production.