-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider critical-components-not-ready taint when machine joins first time #778
Consider critical-components-not-ready taint when machine joins first time #778
Conversation
@SimonKienzler Thank you for your contribution. |
Thank you @SimonKienzler for your contribution. Before I can start building your PR, a member of the organization must set the required label(s) {'reviewed/ok-to-test'}. Once started, you can check the build status in the PR checks section below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the gardener issue I understand that this taint is relevant only during the first time the machine becomes Running
. Current changes in the PR also lets the taint affect the health mechanism of MCM. Health mechanism is only relevant after the machines have become Running
once. So I have suggested changes , not to affect that.
Please let me know if you think otherwise.
@himanshu-kun thanks for your feedback, I gave it a go - please let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you also add 3 test cases in a different Context
block under ##General Machine Health Reconciliation
in pkg/util/provider/machinecontroller/machine_util_test.go
- "Pending machine which is healthy but node without taint shouldn't be marked Running"
- "Pending machine which is healthy and node has taint should be marked Running"
- "Unknown machine which is healthy , and node with the taint should be marked Running"
@himanshu-kun I added the test cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asked for one test case. otherwise looks good.
@SimonKienzler could you allow me editing rights to this PR. I have some changes which I'd like to make |
@himanshu-kun Unfortunately, I cannot (see https://github.com/orgs/community/discussions/5634). Can we make it work in another way? |
/assign |
@himanshu-kun I cherry-picked your commits. I think, this PR should be ready to merge :) |
We discussed internally and have a common concern that should the |
I would say this is necessary to prevent rolling updates from tumbling. Otherwise, mcm would continue the rolling update even if the customer workload cannot be scheduled on the node yet because it is not fully ready. From gardener/gardener#7117:
|
@himanshu-kun please tell me if there is anything missing to merge this PR :) |
No there isn't @timebertt . thanks for the explanation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
What this PR does / why we need it:
A node taint (
node.gardener.cloud/critical-components-not-ready
) is introduced in gardener/gardener#7406 as part of gardener/gardener#7117. This taint needs to be considered by the MCM when it checks the machine health, as a machine is only supposed to beReady
if the introduced taint is not present.Co-Authored-By: @timebertt
Which issue(s) this PR fixes:
Part of gardener/gardener#7117
Special notes for your reviewer:
The change has been tested locally in combination with with
machine-controller-manager-provider-local
in order to verify correct behaviour.Release note: