-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compare generations when available #325
Conversation
ec2aa2a
to
efba100
Compare
@@ -24,7 +24,8 @@ def deploy_succeeded? | |||
end | |||
|
|||
def deploy_failed? | |||
pods.present? && pods.any?(&:deploy_failed?) | |||
pods.present? && pods.any?(&:deploy_failed?) && | |||
observed_generation == current_generation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this not be !=
? Or maybe I just misunderstood it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, and @KnVerey will hopefully correct me if I'm wrong, current_generation
gets bumped when we apply
the config. observed_generation
gets set by the controller reporting the status. When they don't match it means that we're looking at a status for the previous config not current. E.g. we don't want to say the deployment has failed or succeeded before the status reflects the current config's state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh! Thanks for the clarification
|
||
def observed_generation | ||
return -2 unless exists? | ||
# populating this is a best practice, but not all controllers actually do it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, do you have an example of a controller that doesn't populate it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all objects have metadata generations, theoretically any controller could be setting observedGeneration, though this is only meaningful if the controller is writing a status. An example of a resource that has a status but no observedGeneration would be Job or CronJob. See also this k8s core issue. As a sidenote, in 1.11 (I think it made beta there?), CRs will have a reliable metadata.generation
we can use to do this properly in our own controllers. Our generic CR success feature should make use of it when available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the one question for clarification but the rest looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: is there a reasonable way to return early from sync
when observed_generation != current_generation
. Doesn't seem like we get anything useful when that happens.
What do you have in mind that we could skip? The basic sync is just |
Having to put this check in every class just makes me think we're missing an abstraction layer. The |
I think its better as is than adding a new state to unwind later. |
👍 I'll merge as-is then. I added a comment to your WIP so we don't forget about this extra state. |
Problem
I just observed a instant false positive success on a web deployment. Relevant lines:
The rollout definitely did not complete in less than 2s.
Checking the code made me notice that we aren't comparing
metadata.generation
tostatus.observedGeneration
before using the status to determine success. This means we might be making that determination with state data, which could explain the above behaviour.Solution
Compare
metadata.Generation
tostatus.observedGeneration
before_succeeded?
and_failed?
determinations whenever possible. I used the API documentation to figure out which controllers actually populate the observed generation.Since this is a controller race, I can't think of a way to make it happen for integration tests. I have added regression tests at the unit test level.
@Shopify/cloudx