-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG 1875598: Ensure the Virtual Machine provider state is set to Unknown when Failed #696
BUG 1875598: Ensure the Virtual Machine provider state is set to Unknown when Failed #696
Conversation
@JoelSpeed: This pull request references Bugzilla bug 1875598, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
baseToPatch := client.MergeFrom(machine.DeepCopy()) | ||
|
||
if phase == phaseFailed { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this reuse the if block in line 429?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use the block from above, then it's before the baseToPatch
is created and as such, the patching mechanism thinks that the changes were there already. This modification has to come after baseToPatch
is created, but the other one has to come before it, because it is modifying the spec.
I added the line on 438 to try and make this more obvious
pkg/controller/machine/controller.go
Outdated
baseToPatch := client.MergeFrom(machine.DeepCopy()) | ||
|
||
if phase == phaseFailed { | ||
if err := r.patchFailedMachineProviderStatusState(machine); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is called patchFailedMachineProviderStatusState
but is not actually patching but just setting right?
We should be consistent with a patchFailedMachineInstanceAnnotation
and either let patchFailedMachineProviderStatusState
to create a baseToPatch and do its patch request or rename both funcs and let them just set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the naming is wrong, but I think the Instance annotation needs to stay as a patching function otherwise we have the issue of not actually updating the resource. The annotation needs to send a client.Patch
but the rest of the changes need a client.Status().Patch
because of the subresource.
I think the preferable route here is to just rename the new one not to include the patching word and leave the logic as is
This ensures that if the Machine goes into a Failed state, and the provider has already set the providerStatus to include an instanceState or vmState, that we override this with the value `Unknown`. This ensures consistency between the providerStatus and the instance state annotation.
6bdbf76
to
00d6df0
Compare
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: enxebre The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
/test e2e-aws-operator |
/test e2e-aws-operator |
1 similar comment
/test e2e-aws-operator |
/lgtm |
/retest Please review the full test history for this PR and help us cut down flakes. |
10 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/lgtm |
/retest Please review the full test history for this PR and help us cut down flakes. |
const instanceStateField = "instanceState" | ||
const vmStateField = "vmState" | ||
|
||
providerStatus, err := runtime.DefaultUnstructuredConverter.ToUnstructured(machine.Status.ProviderStatus) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work correctly if ProviderStatus is nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a check on L#480 which prevents us getting this far if the providerStatus is nil, hence that scenario is avoided (and if the check is removed, the tests fail)
const vmStateField = "vmState" | ||
|
||
providerStatus, err := runtime.DefaultUnstructuredConverter.ToUnstructured(machine.Status.ProviderStatus) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this really prevent us from doing the rest?
if phase == phaseFailed { | ||
if err := r.overrideFailedMachineProviderStatusState(machine); err != nil { | ||
klog.Errorf("Failed to update machine provider status %q: %v", machine.GetName(), err) | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to fail the whole run here, we should log the error and continue processing. Nothing about this added behavior for setting subsequent status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a look through the kinds of errors that we could get returned from this function and I don't think in a running environment we would ever actually see them. To me, they all seem to be that they would exist only from programming errors (https://github.com/kubernetes/apimachinery/blob/94222d04a59075a01fddedd696037db9e61db6e9/pkg/runtime/converter.go#L404), digging into this, an error could come up if either the object passed is nil (Which is already being checked), or if the type of object cannot be converted there is a possibility for an error. In either of these cases we should see this during unit testing so I think this is better to leave as it is, else we may miss our programming errors as we make changes to this in the future
/hold |
/retest |
2 similar comments
/retest |
/retest |
/hold cancel |
/retest |
/retest Please review the full test history for this PR and help us cut down flakes. |
3 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
@JoelSpeed: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest Please review the full test history for this PR and help us cut down flakes. |
1 similar comment
/retest Please review the full test history for this PR and help us cut down flakes. |
@JoelSpeed: All pull requests linked via external trackers have merged: Bugzilla bug 1875598 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This ensures that if the Machine goes into a Failed state, and the provider has already set the providerStatus to include an instanceState or vmState, that we override this with the value
Unknown
.This ensures consistency between the providerStatus and the instance state annotation.
Ref:
machine-api-operator/pkg/apis/vsphereprovider/v1beta1/vsphereproviderstatus_types.go
Line 64 in 184fb73
The OpenStack and Baremetal providers do not have an equivalent field.