Implement progressive promotion #593

Merged
merged 4 commits into master from progressive-promotion on May 18, 2020

Conversation

@stefanprodan (Member) commented May 15, 2020

This PR adds a new field to the Canary spec: analysis.stepWeightPromotion. When stepWeightPromotion is specified, the promotion phase happens in stages: traffic is routed back to the primary pods progressively, and the primary weight is increased step by step until it reaches 100%. This gives the HPA time to scale up the primary replicas and scale down the canary ones.
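
A minimal sketch of the analysis section with the new field; the field name matches this PR, while the interval, threshold, and weight values are only an example (similar to the spec tested in the comments below):

# example analysis section (values are illustrative)
analysis:
  interval: 1m
  threshold: 10
  maxWeight: 50
  # analysis phase: shift 10% of traffic to the canary per interval, up to 50%
  stepWeight: 10
  # promotion phase: route traffic back to the primary in 5% increments,
  # so going from 50% to 100% primary weight takes ten intervals
  stepWeightPromotion: 5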

Fix: #381

For testing:

# update CRDs
kubectl apply -f https://raw.githubusercontent.com/weaveworks/flagger/progressive-promotion/artifacts/flagger/crd.yaml

# replace Flagger image
kubectl -n istio-system set image deployment/flagger \
flagger=stefanprodan/flagger:prom-weight.1

@codecov-io commented May 15, 2020

Codecov Report

Merging #593 into master will decrease coverage by 0.05%.
The diff coverage is 40.00%.


@@            Coverage Diff             @@
##           master     #593      +/-   ##
==========================================
- Coverage   55.05%   55.00%   -0.06%     
==========================================
  Files          62       62              
  Lines        5278     5303      +25     
==========================================
+ Hits         2906     2917      +11     
- Misses       1951     1959       +8     
- Partials      421      427       +6     
Impacted Files Coverage Δ
pkg/controller/scheduler.go 44.63% <40.00%> (-0.04%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@maruina commented May 15, 2020

Hey @stefanprodan, I was able to test it and it works really well. Thank you!

This is the Canary spec:

spec:
  analysis:
    interval: 1m
    maxWeight: 50
    metrics:
    - interval: 1m
      name: request-success-rate
      thresholdRange:
        min: 99
    stepWeight: 10
    stepWeightPromotion: 5
    threshold: 10

and these are the events:

Events:
  Type     Reason  Age                   From     Message
  ----     ------  ----                  ----     -------
  Warning  Synced  21m                   flagger  java-flagger-primary.maersk not ready: waiting for rollout to finish: observed deployment generation less then desired generation
  Normal   Synced  20m                   flagger  Initialization done! java-flagger.maersk
  Normal   Synced  16m                   flagger  New revision detected! Scaling up java-flagger.maersk
  Normal   Synced  15m                   flagger  Starting canary analysis for java-flagger.maersk
  Normal   Synced  15m                   flagger  Advance java-flagger.maersk canary weight 10
  Normal   Synced  14m                   flagger  Advance java-flagger.maersk canary weight 20
  Normal   Synced  13m                   flagger  Advance java-flagger.maersk canary weight 30
  Normal   Synced  12m                   flagger  Advance java-flagger.maersk canary weight 40
  Normal   Synced  11m                   flagger  Advance java-flagger.maersk canary weight 50
  Normal   Synced  10m                   flagger  Copying java-flagger.maersk template spec to java-flagger-primary.maersk
  Normal   Synced  29s (x10 over 9m29s)  flagger  (combined from similar events): Advance java-flagger.maersk primary weight 100

and this is what I see in my graphs:

[screenshot: canary/primary traffic weight graphs]

(note that I used a small stepWeightPromotion just to make it very evident on the graphs)

@stefanprodan (Member, Author)

Thanks @maruina for testing this 👍

Kubernetes events get compacted, so the best way to monitor Flagger is by tailing the logs with jq:

kubectl -n istio-system logs deploy/flagger -f | jq .msg
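
To also see the HPA reacting while the primary weight ramps up, a plain watch on the autoscalers works; the namespace below is the one from the test above, and which autoscalers show up depends on the autoscalerRef in the Canary (Flagger should create a -primary copy of it):

# watch the canary and primary autoscalers during the promotion phase
kubectl -n maersk get hpa --watch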

@maruina commented May 15, 2020

Better, thanks.

"Starting canary analysis for java-flagger.maersk"
"Advance java-flagger.maersk canary weight 10"
"Advance java-flagger.maersk canary weight 20"
"Advance java-flagger.maersk canary weight 30"
"Advance java-flagger.maersk canary weight 40"
"Advance java-flagger.maersk canary weight 50"
"Copying java-flagger.maersk template spec to java-flagger-primary.maersk"
"Advance java-flagger.maersk primary weight 55"
"Advance java-flagger.maersk primary weight 60"
"Advance java-flagger.maersk primary weight 65"
"Advance java-flagger.maersk primary weight 70"
"Advance java-flagger.maersk primary weight 75"
"Advance java-flagger.maersk primary weight 80"
"Advance java-flagger.maersk primary weight 85"
"Advance java-flagger.maersk primary weight 90"
"Advance java-flagger.maersk primary weight 95"
"Advance java-flagger.maersk primary weight 100"
"Promotion completed! Scaling down java-flagger.maersk"

Looking forward to the next release :)

@mathetake (Collaborator) left a comment

Looks great! Thanks @stefanprodan

pkg/controller/scheduler.go: review comments (outdated, resolved)
@stefanprodan stefanprodan merged commit f5a3b9d into master May 18, 2020
@stefanprodan stefanprodan deleted the progressive-promotion branch May 18, 2020 09:38
Development

Successfully merging this pull request may close these issues: Progressive promotion (#381)

4 participants