🐛 openstackmachine: do not set transient error message and reason #1301
We also often get authentication failures in our osm.status.failureMessage, e.g.:
status:
  addresses:
  - address: 10.30.3.242
    type: InternalIP
  failureMessage: 'OpenStack instance cannot be created: get server list: Successfully
    re-authenticated, but got error executing request: Authentication failed'
  failureReason: UpdateError
  instanceState: ACTIVE
  ready: true
WDYT about catching authentication errors in all cases and reconciling them again?
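A minimal Go sketch of that idea, using only the standard library. The `statusError` type and the `isTransient` helper are hypothetical stand-ins, not gophercloud or CAPO API. Treating HTTP 401 as transient is an assumption made for illustration: the point is that such an error would lead to a requeue instead of being written to failureMessage/failureReason.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical stand-in for an OpenStack client error
// that carries the HTTP status code of the failed request.
type statusError struct {
	Code int
	Msg  string
}

func (e *statusError) Error() string {
	return fmt.Sprintf("status %d: %s", e.Code, e.Msg)
}

// isTransient reports whether err should be retried on the next reconcile
// instead of being recorded as a terminal failure. Counting 401 as
// transient is an assumption of this sketch.
func isTransient(err error) bool {
	var se *statusError
	if !errors.As(err, &se) {
		return false // unknown errors are not classified as transient here
	}
	return se.Code == http.StatusUnauthorized || se.Code >= 500
}

func main() {
	// Wrap the error the same way a caller might add context to it.
	err := fmt.Errorf("get server list: %w",
		&statusError{Code: http.StatusUnauthorized, Msg: "Authentication failed"})

	if isTransient(err) {
		fmt.Println("requeue without setting failureMessage")
	} else {
		fmt.Println("set failureReason and failureMessage")
	}
}
```

Because `errors.As` unwraps the chain, the check still works when the status error has been wrapped with context such as "get server list".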
It seems that only error codes >= 500 are considered retryable errors (as of now). cluster-api-provider-openstack/pkg/utils/errors/errors.go Lines 26 to 33 in 505b5d2
I'd say that E.g.
I think it is better to make the OpenStack calls via gophercloud retryable than to filter when setting the failureMessage/failureReason. I'm no expert on OpenStack error codes or how gophercloud handles them; hopefully someone else can support me on this.
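What "making the calls retryable" could look like is sketched below in plain Go. This is not gophercloud's API and not what the PR implements: `httpError` and `retry` are hypothetical names, and the rule that only status codes >= 500 are retried simply mirrors the behaviour described in this comment.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// httpError is a hypothetical error type carrying an HTTP status code.
type httpError struct{ Code int }

func (e *httpError) Error() string {
	return fmt.Sprintf("unexpected status code %d", e.Code)
}

// retry calls fn up to attempts times, waiting delay between tries.
// It stops early on success or on an error that is not retryable
// (in this sketch: anything that is not an httpError with Code >= 500).
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		var he *httpError
		if !errors.As(err, &he) || he.Code < 500 {
			return err // not retryable: give up immediately
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return &httpError{Code: 503} // simulated transient failure
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

With a wrapper like this, a short-lived 5xx response never reaches the status fields, while a 4xx error is returned to the caller on the first attempt.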
Sorry @mdbooth / @jichenjc for the direct mention. I'm fine with the proposed solution from @seanschneeweiss regarding extending (FYI: this issue is currently the only one which is marked for milestone 0.6.4)
@bavarianbidi Let's discuss the modification of
This is a great cleanup, thanks. Comments inline. Push at your discretion.
/lgtm
The /hold is in case you want to address the review comments. Feel free to remove it.
I just want to say thank you for working on this @seanschneeweiss and reviewers! We are really looking forward to this feature.
@mdbooth thank you for the great review. All suggestions have been incorporated. It would be great if you could check again.
@seanschneeweiss sure, I'm totally fine with that.
/lgtm
Agree with the nit, but it's not user facing so really not very important. No need for re-review if you decide to fix it!
Signed-off-by: Sean Schneeweiss <sean.schneeweiss@mercedes-benz.com>
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lentzi90, mdbooth, seanschneeweiss
/retest
/lgtm
I think we should cherry-pick this since we have it in the v0.6.4 milestone.
@lentzi90: #1301 failed to apply on top of branch "release-0.6":
Oh, that was not so straightforward.
What this PR does / why we need it:
With this PR we set failureReason & failureMessage of the OpenStackMachine for the following three cases:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1116
Special notes for your reviewer:
failureReason & failureMessage to have a choice when setting the failureReason compared to just setting UpdateMachineError for everything.
TODOs:
/hold
Sean Schneeweiss sean.schneeweiss@mercedes-benz.com, Mercedes-Benz Tech Innovation GmbH, Provider Information