[BUG] Error: Failed waiting for log message 'Wrote kubeconfig' from node 'k3d-testing-server-1' #315
Comments
Hi @Filius-Patris , thanks for opening this issue. Can you try to run the command again and abort it (e.g. Ctrl+C) after seeing |
For now I can just append them as complete files, if you find them helpful:
|
What do you mean by "similar"? I can see the log message that k3d is looking for in all three logs you attached 🤔 |
The error with 3 servers and 3 agents:
I tried debugging again, but this time, I tried to abort just before k3d starts rolling back the cluster. It took a few tries, but then I noticed that For some reason though This time the logs are extremely short: server-1:
server-2:
I will try to get the logs of server-0, but this might be more difficult and needs some scripts... |
Oh, btw, I just realized you can run the
|
Something seems to be really off there 🤔
Additionally, you could try cleaning up your docker environment (if possible):
|
Pulling the image
Sure, works fine:
Resetting
I went as far as removing every image, completely uninstalling k3d and Docker Desktop and reinstalling them. After reinstalling them, I ran the command and sure enough had the same error. I know that all images were deleted too, because there was the log message about pulling
Getting more logs
I was never able to get the logs of the aborted |
Doing some time math led me to the following possible clues:
That's where we get into source code. I'd like to hack k3d in the following way, but I don't know golang, so help is very much needed: |
So this line in the logs you've posted looks very suspicious The timeout you mention in your last comment is the context timeout that you can set with |
I have the same problem, although I am trying to run k3d (version 3.0.1) on a Linux machine with Ubuntu 20.04.1 LTS. My Docker version is exactly the same as the one mentioned here. Initially I got k3d working with 3 servers, but after I deleted a cluster once, it didn't work anymore. |
@rr-appadaptive do you also get the same logs? |
Could that mean that more than 3 nodes (e.g. 3 servers and 2 agents) would solve the problem? I'm going to try that... |
Still present in v3.0.1, just FYI. |
@Filius-Patris there was no related change in v3.0.1 and so far I also couldn't detect any issue within the k3d code. |
I have the same problem although I try to get k3d (version 3.0.1) running on a Linux machine with Ubuntu version 20.04.1 LTS
|
Ok, now THIS is concerning: I could make it reproducible in a DigitalOcean VPS.
What am I missing? How can this happen??? EDIT: When you reboot the machine, then launching a cluster works. What? |
@Filius-Patris I seriously have no clue why you're experiencing this. EDIT: just installed kubectl afterwards to verify that the cluster is actually up:
root@ubuntu-s-4vcpu-8gb-fra1-01:~# kubectl get nodes
NAME                   STATUS   ROLES    AGE     VERSION
k3d-testing-server-0   Ready    master   3m8s    v1.18.6+k3s1
k3d-testing-agent-0    Ready    <none>   3m1s    v1.18.6+k3s1
k3d-testing-agent-1    Ready    <none>   3m      v1.18.6+k3s1
k3d-testing-agent-2    Ready    <none>   2m58s   v1.18.6+k3s1
k3d-testing-server-1   Ready    master   2m49s   v1.18.6+k3s1
k3d-testing-server-2   Ready    master   2m45s   v1.18.6+k3s1 |
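For anyone who wants to script the "is the cluster actually up" check instead of eyeballing the table, here is a minimal sketch. `count_not_ready` is a hypothetical helper name, not part of k3d or kubectl. It assumes the default `kubectl get nodes --no-headers` column layout, where STATUS is the second column.

```shell
#!/bin/sh
# Sketch only: count_not_ready is a made-up helper, not a k3d or kubectl feature.
# It reads `kubectl get nodes --no-headers` output on stdin and prints how many
# nodes have a STATUS column that is not exactly "Ready".
count_not_ready() {
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}
```

Usage would be something like `kubectl get nodes --no-headers | count_not_ready`. Note that a combined status such as `Ready,SchedulingDisabled` is counted as not ready by this exact-match comparison.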
Is there any update how to fix this? I also experience the same issue...
|
@zasherif Could you 1.) Give some hints about your machine? (VPS/Mac/Ubuntu/whatever...) 2.) Try rebooting the machine? |
@Filius-Patris I use 18.04.1-Ubuntu, with 5.4.0 kernel and Docker version 19.03.12, build 48a66213fe |
Another piece of information btw: The docker daemon logs (or in my case, terminal output, I started it manually) show an error in the docker engine:
Can't tell if this is cause or symptom, but I figured documenting doesn't hurt. |
I can also reproduce the issue on two machines: k3d-v3.0.1 | ubuntu-20.04LTS | docker-ce 5:19.03.12 by running
@iwilltry42 I can provide you with full root access to one of these machines (it is a VPS in OVH) for as long as you need. Just confirm if you want it. |
Hey 👋 @Filius-Patris I think the docker daemon logs you posted are just a "side-effect" of the actual problem: because we're failing to wait for the log message from a secondary master node, we're cancelling the overall used context and thus aborting all ongoing operations (like pulling the image for the loadbalancer and creating the other nodes). So far I couldn't reproduce this with versions lower than 1.19, so @rogeliodh , I'd definitely like to make use of your offer to have an environment where I can reproduce it 👍 Good for debugging this: I just released k3d v3.0.2 which includes a new |
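To make the "wait for a log message, otherwise give up" behaviour described above easier to picture, here is a rough shell analogue. This is not k3d's implementation (k3d is written in Go and uses a context for cancellation). `has_log_message` and `wait_for_log` are hypothetical helper names, and the one-second polling interval is an assumption. The only real command used is `docker logs`.

```shell
#!/bin/sh
# Sketch only: a shell analogue of "wait for a log line or give up".
# This is not k3d's Go code; both function names are made up for illustration.

# has_log_message MESSAGE: exit 0 if the text on stdin contains MESSAGE
# (matched as a fixed string, not as a regular expression).
has_log_message() {
    grep -qF -- "$1"
}

# wait_for_log NODE MESSAGE TIMEOUT_SECONDS: poll `docker logs` once per second
# (assumed interval) until MESSAGE shows up or the timeout expires.
wait_for_log() {
    node=$1
    message=$2
    timeout=$3
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if docker logs "$node" 2>&1 | has_log_message "$message"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "Failed waiting for log message '$message' from node '$node'" >&2
    return 1
}
```

Something like `wait_for_log k3d-testing-server-1 'Wrote kubeconfig' 60` can help when debugging by hand: if it times out, the node's logs are still there to inspect, because nothing gets rolled back.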
Test run on @rogeliodh server:
I figured out that it will also fail sometimes if you only create a single-server cluster (with the
I could imagine that the environment comes into play here, as the CPU maxes out even when starting only a single server. So I could imagine that the leader election fails due to resource limitations, making the whole cluster crash 🤔 |
Hello! Thanks for your help on this bug. It's the same for me with multiple server creation ("--servers 3"):
INFO[0000] Created network 'k3d-mycluster'
I hope it's the same error and I hope it can help |
Hi @deromemont , thanks for adding your comment here :) |
This is a testable statement. So I went ahead and tested it. I have some spare credits on vultr, thus I used this provider. I made the following install script:
#!/bin/sh
# Install Docker, install k3d, create the test cluster, then power off.
# Every step is chained with &&, so the machine only shuts down if all steps succeed.
snap install docker && \
curl -s https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash && \
k3d cluster create testing --servers 3 --agents 3 && \
shutdown -h now
Then I made the following VPSes and specified the script above as a start script:
This assertion might be correct, and if I can make another thesis on top: One CPU isn't enough. |
Thanks for your tests @Filius-Patris ! |
I've assigned Docker 4 CPUs and 8GB RAM and I get this error on 3 or more servers (
@iwilltry42 If you are right with your assumption, this is a bug in k3s isn't it? 🤔 Because k3s was designed exactly for edge cases and shouldn't crash when only limited resources are available. Or is it not related to k3s and merely a docker issue? Off-topic: |
You're right, that k3s shouldn't have an issue with limited resources, so I would assume that it's the additional layer (docker) here that introduces those issues. Though I cannot verify this right now..
That's right 😁 |
Any progress on this issue? |
@sdghchj Could you test the above script, but install Docker via their PPA or some other native package instead of snap? Run the tests again and report what (if anything) is different. That would be helpful! |
I just hit the same problem with the command
It shows an infinite output log
Docker version (Docker Desktop for Windows)
But I don't use |
Hi @zengzhengrong , seems like your issue is different (different cause). You may want to use the |
Is there any update on this? If not, I think we may want to close this issue as "stale" 🤔 |
AFAWK it was due to constrained hardware resources. Because k3d is designed for such environments, it might be sensible to document some minimal hw requirements (especially bc I couldn't run it on my dev machine, MB Air 17). Dear future reader: Upgrade your hardware / rent a server with more beef. |
Good disclaimer 😅 |
Yeah, I ran it on Docker Desktop. I have reinstalled it since, but by default it only allocates 2GiB of RAM to the VM running Docker. The more recent RasPis meet the (supposed) requirement of 2 cores and 4GiB RAM. btw, I've just prepared a PR for the docs. If further discussions arise, we could discuss them there. |
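For future readers who want to check their setup before creating a multi-server cluster, here is a minimal sketch of a pre-flight check. The thresholds (2 CPUs, 4GiB) are only the "supposed" values from this thread, not official k3d requirements. `enough_resources` and `check_docker_resources` are hypothetical helper names. The real command used is `docker info --format`, which reports the CPUs and memory (in bytes) visible to the Docker daemon.

```shell
#!/bin/sh
# Sketch only: thresholds are the "supposed" minimums from this thread,
# not documented k3d requirements. Both function names are made up.
MIN_CPUS=2
MIN_MEM_BYTES=$((4 * 1024 * 1024 * 1024))

# enough_resources CPUS MEM_BYTES: exit 0 if both values reach the thresholds.
enough_resources() {
    [ "$1" -ge "$MIN_CPUS" ] && [ "$2" -ge "$MIN_MEM_BYTES" ]
}

# check_docker_resources: ask the Docker daemon what it has available and
# warn if that is below the assumed minimums.
check_docker_resources() {
    info=$(docker info --format '{{.NCPU}} {{.MemTotal}}') || return 2
    cpus=${info%% *}
    mem=${info##* }
    if enough_resources "$cpus" "$mem"; then
        echo "ok: $cpus CPUs, $mem bytes of memory"
    else
        echo "warning: $cpus CPUs, $mem bytes of memory; multi-server clusters may fail" >&2
        return 1
    fi
}
```

Run `check_docker_resources` before `k3d cluster create`. The reported memory is usually a bit below the nominal size of the machine or VM, so a host sold as "4GB" may fail the strict comparison. Lower `MIN_MEM_BYTES` a little if that matters for your setup.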
What did you do?
How was the cluster created?
k3d cluster create testing --servers 2 --agents 3
What did you do afterwards?
What did you expect to happen?
Some info messages similar to the demo video on k3d.io and success.
Screenshots or terminal output
Recording on Asciinema
Which OS & Architecture?
MacOS Catalina (10.15.5)
Which version of k3d? (installed via homebrew)
Which version of docker?