Switching between canary and k8s rollout with no downtime #388

Closed
tega90 opened this issue Dec 2, 2019 · 11 comments · Fixed by #495
Labels
question Further information is requested

Comments


tega90 commented Dec 2, 2019

Is it possible to switch between canary and k8s rollout with no downtime?

Currently, when I delete the Canary object, my original Deployment is left with 0 replicas.
I use Helm for deployments and Linkerd as the service mesh.

@stefanprodan
Member

You can disable the canary rollout by setting skipAnalysis to true inside the canary spec. The canary deletion is handled by the Kubernetes GC; you need some kind of script to make the removal work without downtime. I've explained the removal process here: #308 (comment)
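For reference, a minimal sketch of a Canary spec with skipAnalysis enabled (the podinfo name, namespace, port and apiVersion are placeholders; adjust them to your setup and Flagger version, and note that some versions may still require an analysis section):

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  service:
    port: 9898
  # with skipAnalysis enabled, changes are promoted to the primary
  # without running the canary analysis
  skipAnalysis: true
```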

stefanprodan added the question label on Dec 2, 2019
@n0rad
Contributor

n0rad commented Dec 5, 2019

I'm facing the same problem. I can work around the -primary naming with Helm templating on the resources, based on whether the Canary resource is deployed or not, but I still have an issue with the original Deployment being left with replicas: 0, which overrides the HPA.

I can't find where this happens in Flagger's code, but in my view, if Flagger sets replicas: 0 when the Canary resource is deployed, it should restore the replica count when the resource is removed.
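To illustrate the state I end up with (a rough sketch with hypothetical names, not my actual manifests):

```yaml
# Original Deployment after the Canary is deleted:
# Flagger scaled it to zero and the value is never restored.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 0          # left behind by Flagger
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:1.0.0
---
# The HPA that is supposed to own the replica count;
# with spec.replicas set to 0 it never scales the Deployment back up.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
```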

@tega90
Author

tega90 commented Dec 10, 2019

@n0rad Helm 3 might help you with that problem because of its 3-way strategic merge patches: https://helm.sh/docs/faq/

@n0rad
Contributor

n0rad commented Dec 10, 2019

Thanks @tega90, indeed Helm 3 will simplify the templating for the -primary resources, and we're looking forward to its support in helm-operator.

I'm still wondering what to do with the replicas: 0.

The process described in #308 does not look friendly to me, and I don't expect anyone here at @blablacar to remember to run manual steps when they want to remove a canary on a live system, especially since we use GitOps and removing any other resource only requires deleting its file from Git.

@n0rad
Contributor

n0rad commented May 25, 2020

I just tested 1.0.0-rc.5 on GKE 1.15.11, and removing the Canary resource still leaves the Deployment with replicas: 0.

@n0rad
Contributor

n0rad commented May 25, 2020

I don't know if it's related, but the Deployment I'm using does not have an initial replicas value and is managed by an HPA.

@n0rad
Contributor

n0rad commented May 26, 2020

I tested with an explicit replicas value and without an HPA, and the result is the same. I think this issue should be reopened.

@stefanprodan
Member

@n0rad have you enabled the finalizers? https://docs.flagger.app/usage/how-it-works#canary-finalizers
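A minimal sketch of opting in, based on the finalizers section of the linked docs (the podinfo target, autoscaler and analysis values are placeholders):

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # revert the target Deployment (and HPA) to their original state
  # when this Canary is deleted
  revertOnDeletion: true
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  autoscalerRef:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    port: 9898
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
```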

@n0rad
Contributor

n0rad commented May 27, 2020

Thanks @stefanprodan. I thought it would be the default behavior; why isn't it?

I ran tests with podinfo and it works fine with a few changes to the chart: #595

@stefanprodan
Member

> I thought it would be the default behavior; why isn't it?

Because finalizers can have grave side effects. For example, if Flagger is down or has been removed from the cluster, deleting a namespace with canaries will block forever. You shouldn't be creating the services since Flagger does that for you. The finalizers are meant to be used while you're evaluating Flagger; in the long run you should remove the ClusterIP services from your charts and let Flagger bootstrap the app instead of patching existing objects.
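To make that concrete, here is a sketch of the relevant part of a Canary spec (names and ports are placeholders): with a service section like this, Flagger generates the podinfo, podinfo-primary and podinfo-canary ClusterIP services itself, so the chart doesn't need to ship its own Service object.

```yaml
# excerpt of a Canary spec: the chart ships no Service objects,
# Flagger generates podinfo, podinfo-primary and podinfo-canary from this
spec:
  service:
    port: 9898        # port exposed by the generated ClusterIP services
    targetPort: 9898  # container port of the target Deployment
```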

@n0rad
Contributor

n0rad commented May 27, 2020

The finalizers can be enabled with a flag, so that's fine for me.

Let's talk about the overall lifecycle in #595, especially since this ticket is closed and the behavior works as documented.
