[Bug]: agents are hang in status "still creating" #623
Comments
Just ran into this as well! The only error I can see from the logs is a timeout for the system-upgrade-controller:
|
Same here. It was working a few days ago |
I also have the same problem. |
Any chance of pinning an old version? |
In version = "1.9.5" or whichever was the latest that worked for you. After changing that, run |
Pinning the old version didn't work for me. |
I started digging through https://mirror.dogado.de/opensuse/tumbleweed/appliances/ and was able to find a mirror link that pinned the version.
I'm sure the break was more recent (~20230302?), but I just used the pin from the last time I successfully deployed. Can confirm that pinning the version will fix this specific issue. |
Now getting a remote executor timeout
|
Did you redeploy from scratch? |
Sure thing. Twice or thrice, I even deleted the .terraform folder |
People, no need to look into MicroOS. All we need to do is SSH into the node, execute the failing bash files manually, and see what's happening, while also looking at the logs via journalctl. Will do ASAP, keep you posted. |
Thanks for jumping in! As I posted in the beginning, it seems the k3s agent is unable to start because of "server is not ready: unable to find interface: route ip+net: no such network interface" in the k3s agent logs, but to be honest I haven't found any solution or reason for that. |
@bulnv Exactly! The private interface name has changed, could be coming from Hetzner themselves, working on a backward compatible fix now. |
Alright folks, that's fixed as part of v1.9.7, just released. I just renamed According to ChatGPT, it's the newest Linux kernels that now work this way, so it should be permanent from now on. The interface also gets discovered automatically now, so we are able to drop a few lines from the cloud-init too! Enjoy 🚀 ✨ |
@mysticaltech Still not working for me, and I can't dig deeper because SSH has been blocked on the server side
|
@bulnv Just terraform destroy, and try again fresh, it will unblock SSH. Also, make sure to run |
@mysticaltech I did it. SSH was blocked through my own fault. I've logged in to the machine and found that the secondary IP is not assigned to eth1, which is why k3s can't start. Let's continue in the open issue 626 |
@bulnv Yes, the name has changed now, what is it assigned to on your end? I expect it to be enp7s0, now if it gives eth1 or ens10, it will not work. So we will need to determine the name of the interface dynamically. And rename it with something like:
That way we can move all config back to using eth1. The thing is, the above command will not be permanent, and we need to make it permanent. Right now I am at work and cannot focus on that. I will come back to it tonight ASAP; in the meantime, please don't hesitate to send PR fixes. |
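The command referred to in the comment above did not survive in this copy of the thread. As a rough illustration only, not the fix that shipped, here is a minimal Python sketch of the idea being discussed: detect whichever name the private interface came up with and build the standard `ip link` calls that rename it to eth1. The helper names `find_private_interface` and `rename_commands` are hypothetical, and the sketch assumes that `lo` and `eth0` are the only interfaces that are not the private one.

```python
from typing import Iterable, List, Optional

# Assumption: "lo" is loopback and "eth0" is the public NIC, so any other
# interface is treated as the private network interface.
IGNORED = {"lo", "eth0"}
TARGET = "eth1"


def find_private_interface(names: Iterable[str]) -> Optional[str]:
    """Return the first interface name that is not in IGNORED, or None."""
    for name in names:
        if name not in IGNORED:
            return name
    return None


def rename_commands(current: str, target: str = TARGET) -> List[List[str]]:
    """Build the `ip link` argument lists that rename `current` to `target`.

    Returns an empty list when the interface already has the target name.
    The link is taken down first because a rename is normally refused
    while the interface is up.
    """
    if current == target:
        return []
    return [
        ["ip", "link", "set", "dev", current, "down"],
        ["ip", "link", "set", "dev", current, "name", target],
        ["ip", "link", "set", "dev", target, "up"],
    ]


# Example (dry run): on a node, the names could be read from /sys/class/net.
#   import os
#   iface = find_private_interface(sorted(os.listdir("/sys/class/net")))
#   for cmd in rename_commands(iface):
#       print(" ".join(cmd))
```

A rename done this way does not survive a reboot, which is the permanence problem raised in the comment; making it stick would need something that runs at boot, and that part is not sketched here.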
If you folks could ssh into your nodes (see readme) and run |
|
CPX31
|
@mysticaltech I guess for me and CPX nodes the issue is resolved so far. Thanks a bunch for the help |
We now support all kinds of interface names, it doesn't matter which comes up, we end up renaming it to eth1. |
@mysticaltech Thanks! I can confirm it is now working on CX series VMs with Intel CPUs. |
Good to hear @tripadvisor101, thanks for the confirmation! |
Hi there, sorry for being late to the party. I work on a cached image of microos and didn't update kube-hetzner for a while, so I had no issues. However, I have now checked out the latest updates (kube-hetzner:1.10.0), still using my cached image of microos (from January), and guess what, networking is all broken... Now on yesterday's microos and kube-hetzner:1.10.0, things work again. I would say that for stability reasons, each release of kube-hetzner should be explicitly tied to a version of microos. In reality it is tied, but we just don't keep track of it. I'm afraid that microos turns out to be quite a pain point. Its image gets updated every few days without tracking versions, so there is hardly an easy way to freeze the version you're working with! The different releases of kube-hetzner only work with specific historic microos versions, which are now mostly lost. @mysticaltech's hard work to fix this (THANKS!) is not the first iteration of this cat-and-mouse game. What is our way out of this? |
Maybe Fedora CoreOS? I don't know if it's been discussed before, but it's used by Red Hat for OpenShift |
@valkenburg-prevue-ch @aleksasiriski In one year, we had one breaking change of microOS that broke the deployment of new nodes, which was in fact due to an improvement in the networking stack. So it's a nonissue for me! It's actually excellent and has no versions, it's a rolling release based on tumbleweed. |
Description
Kube.tf file
Screenshots
No response
Platform
linux