Feature Request: Provide seamless integration / removal for live systems #308

Closed
pkaramol opened this issue Sep 23, 2019 · 4 comments · Fixed by #384

Comments

@pkaramol

pkaramol commented Sep 23, 2019

Currently, adding the required configuration to a live system requires downtime.

(e.g. Flagger takes over the Service and VirtualService resources, which inevitably causes disruption.)

The same applies when removing the Flagger infrastructure from a live system.

(From some draft experimentation, this seems to be even more complicated.)

If completely seamless addition / removal of Flagger on a live production system is not possible, perhaps the documentation should be extended to cover:

  • the actual (or recommended) steps to add Flagger to a live production system using Kubernetes and a service mesh
  • best practices to limit downtime to the extent possible (during addition/removal of Flagger)
  • the actual (or recommended) steps to remove Flagger from a live production system
@stefanprodan
Member

stefanprodan commented Sep 23, 2019

Taking over a live system depends on the Istio objects you're using. Flagger uses a combination of ClusterIP services, Istio destination rules and an Istio virtual service. I see no clear path for this. My guess is that if you're using a single ClusterIP service and a VirtualService that both have the same name as the Deployment, Flagger can take over; otherwise it will result in Pilot conflicts.
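
The condition guessed at above can be written down as a small pre-flight check. This is only a sketch of that guess, not something Flagger itself provides: the `Obj` type and the `can_take_over` function are hypothetical, and the objects would have to be collected from the cluster by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """Minimal stand-in for a Kubernetes/Istio object (hypothetical type)."""
    kind: str               # e.g. "Service" or "VirtualService"
    name: str
    service_type: str = ""  # only meaningful when kind == "Service"


def can_take_over(deployment_name: str, objects: list[Obj]) -> bool:
    """Return True when there is exactly one ClusterIP Service and exactly one
    VirtualService, and both are named like the Deployment."""
    services = [o for o in objects if o.kind == "Service"]
    virtual_services = [o for o in objects if o.kind == "VirtualService"]
    if len(services) != 1 or len(virtual_services) != 1:
        return False
    svc, vs = services[0], virtual_services[0]
    return (
        svc.service_type == "ClusterIP"
        and svc.name == deployment_name
        and vs.name == deployment_name
    )
```

A `False` result only means the setup falls outside the case described in this comment, where Pilot conflicts are to be expected.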

@pkaramol
Author

> if you're using a single ClusterIP and VirtualService that have the same name as the Deployment, then Flagger can take over,

Having done some trials on this, I confirm it does take over.

I believe, though, that there is a certain amount of downtime (perhaps because the previous objects are deleted and recreated?).

I'm not 100% certain, however. I intend to run more trials on this and come back with additional info (if any, that is).

@stefanprodan
Member

Flagger does not delete any objects; it modifies them in place.

@stefanprodan
Member

stefanprodan commented Sep 23, 2019

As for removing a canary from a live system, this can be achieved with the following steps, in order:

  • scale Flagger to zero
  • delete the ownerReferences from all objects owned by Flagger (primary deployment, ClusterIP services, Istio destination rules, Istio virtual service)
  • delete the canary object
  • scale up the main deployment (the one that Flagger refers to as the canary)
  • edit the ClusterIP service and VirtualService back to what they were before Flagger was used
  • delete the -primary deployment, service and destination rule
  • scale Flagger back to one replica

PS. I think this could be automated with a script if the VirtualService can be restored to its previous state from a file.
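
A rough sketch of what such a script might look like, written as a Python function that only builds the `kubectl` command lines in the order listed above and does not run them. Several things here are assumptions rather than facts from this thread: that Flagger runs as a deployment named `flagger` in the `istio-system` namespace, that the pre-Flagger Service and VirtualService manifests were saved to files beforehand, and the function name `removal_plan` itself. The list of owned objects is passed in by the caller so that no object names are guessed.

```python
import shlex

# JSON patch that drops the ownerReferences field from an object.
REMOVE_OWNER_REFS = '[{"op": "remove", "path": "/metadata/ownerReferences"}]'


def removal_plan(
    name: str,                      # name of the canary and its target deployment
    namespace: str,                 # namespace of the workload
    owned: list[tuple[str, str]],   # (kind, name) of every object owned by Flagger
    backup_files: list[str],        # saved pre-Flagger Service/VirtualService manifests
    replicas: int,                  # replica count for the main deployment
    flagger_namespace: str = "istio-system",  # assumption, adjust to your install
    flagger_deployment: str = "flagger",      # assumption, adjust to your install
) -> list[list[str]]:
    """Build the kubectl commands for the removal steps, in order."""
    k = ["kubectl", "-n", namespace]
    fl = ["kubectl", "-n", flagger_namespace, "scale", f"deployment/{flagger_deployment}"]
    plan = [fl + ["--replicas=0"]]                       # 1. stop Flagger
    for kind, obj_name in owned:                         # 2. detach owned objects
        plan.append(k + ["patch", kind, obj_name, "--type=json", "-p", REMOVE_OWNER_REFS])
    plan.append(k + ["delete", "canary", name])          # 3. delete the canary object
    plan.append(k + ["scale", f"deployment/{name}", f"--replicas={replicas}"])  # 4.
    for path in backup_files:                            # 5. restore previous routing
        plan.append(k + ["apply", "-f", path])
    for kind in ("deployment", "service", "destinationrule"):  # 6. drop -primary objects
        plan.append(k + ["delete", kind, f"{name}-primary"])
    plan.append(fl + ["--replicas=1"])                   # 7. start Flagger again
    return plan


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    example = removal_plan(
        "podinfo", "test",
        owned=[("deployment", "podinfo-primary"), ("virtualservice", "podinfo")],
        backup_files=["podinfo-svc.yaml", "podinfo-vs.yaml"],
        replicas=2,
    )
    for cmd in example:
        print(shlex.join(cmd))
```

Printing the plan first and reviewing it before running anything seems safer on a production cluster. Note that a JSON patch `remove` fails if the object has no `ownerReferences` field, so the `owned` list should contain only objects that actually carry one.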
