Trigger orphan collection on OutOfRange machine status #667
@dkistner could you provide your review as well? :)
Merging soon would help, as a new release is awaited by Core colleagues.
@AxiomSamarth could you provide your approving review and merge?
Hi @himanshu-kun,
sorry for the delay. From my point of view this looks good.
But I would really like to have @AxiomSamarth's opinion on this as well.
/lgtm
What this PR does / why we need it:
Until now, orphan collection was triggered only when a machine object was deleted or after 30 minutes.
Because of eventual consistency in AWS, multiple VMs can end up backing a single machine object. To keep the machine controller from reaching the VM quota in that case, orphan collection is now also triggered when a machine object is updated with an OutOfRange error description. A machine object can only be updated to OutOfRange by the GetMachineStatus call, which is invoked only during machine creation or deletion.
In summary, orphan collection is now triggered in three cases: on machine object deletion, after 30 minutes, and when a machine object is updated with an OutOfRange error description.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
I do not enqueue into the orphan queue whenever the status contains an OOR (OutOfRange) error, but only when the machine object's status previously did not have the OOR error and now has it. This handles the scenario where a machine object already has an OOR error and some other status update event fires. If we enqueued merely on the presence of the OOR error, the orphan logic would be triggered almost all the time.
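The transition check described above could look roughly like the following. This is only a minimal sketch, not the code in this PR: the `machineStatus` type, the `outOfRangeMarker` constant, and both function names are hypothetical, and the sketch assumes the error can be detected by a substring match on the status description.

```go
package main

import "strings"

// outOfRangeMarker is a hypothetical marker; the actual controller may
// detect the OutOfRange condition differently (assumption for this sketch).
const outOfRangeMarker = "OutOfRange"

// machineStatus is a simplified, hypothetical stand-in for the status of a
// machine object. Only the field needed for the check is modelled.
type machineStatus struct {
	LastOperationDescription string
}

// hasOutOfRange reports whether the status description carries the OOR marker.
func hasOutOfRange(s machineStatus) bool {
	return strings.Contains(s.LastOperationDescription, outOfRangeMarker)
}

// shouldEnqueueOrphanCollection returns true only on the transition from
// "no OOR error" to "OOR error". A machine that already had the error and
// receives an unrelated status update does not trigger orphan collection again.
func shouldEnqueueOrphanCollection(oldStatus, newStatus machineStatus) bool {
	return !hasOutOfRange(oldStatus) && hasOutOfRange(newStatus)
}
```

An update handler would call `shouldEnqueueOrphanCollection(old.Status, new.Status)` and add to the orphan queue only when it returns true, so repeated updates on a machine that already carries the error are ignored.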
Release note: