Recover from previous incomplete cluster creation (id_rsa: no such file or directory) #8824
Comments
BTW, this particular issue was seen twice: #8821 and another time. Logs for the second time follow:
This probably has to move to important long-term.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
Stale issues rot after 30d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
If container creation fails, as in #8814, the attempt to recreate the container during recovery fails as well: we see that the container already exists, and then fail because the local SSH keys are missing:
! StartHost failed, but will try again: provision: Error getting config for native Go SSH: open /Users/tstromberg/.minikube/machines/stress8d5b4/id_rsa: no such file or directory
NOTE: --delete-on-failure does recover from this, but this is not the default.
Here's what my suggestion boils down to, roughly:
When starting up a cluster, store a signal that the cluster is in its initializing stage. This may go along with "Add transient states ("stopping", "starting")" (#8730).
During subsequent startups, check for the signal you dropped. If it doesn't reflect a cluster that survived initialization, delete the cluster by default. The existence of SSH keys (for non-none drivers) could serve as an initial or additional signal for this.