[BUG] node-label not working #681

Closed
logileifs opened this issue Jul 20, 2021 · 8 comments · Fixed by #584 or #598
@logileifs

What did you do

  • How was the cluster created?

    • k3d cluster create test-cluster -a 1 --label 'foo=bar@agent[0]'
  • What did you do afterwards?

    • kubectl get node k3d-test-cluster-agent-0 --show-labels

What did you expect to happen

I expected label foo=bar to be there

Screenshots or terminal output

$ kubectl get node k3d-test-cluster-agent-0 --show-labels
NAME                       STATUS   ROLES    AGE   VERSION        LABELS
k3d-test-cluster-agent-0   Ready    <none>   20s   v1.21.2+k3s1   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=k3s,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k3d-test-cluster-agent-0,kubernetes.io/os=linux,node.kubernetes.io/instance-type=k3s

Which OS & Architecture

  • MacOS amd64

Which version of k3d

k3d version v4.4.7
k3s version v1.21.2-k3s1 (default)

Which version of docker

$ docker version
Client:
Cloud integration: 1.0.17
Version: 20.10.7
API version: 1.41
Go version: go1.16.4
Git commit: f0df350
Built: Wed Jun 2 11:56:22 2021
OS/Arch: darwin/amd64
Context: desktop-linux
Experimental: true

Server: Docker Engine - Community
Engine:
Version: 20.10.7
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: b0f5bc3
Built: Wed Jun 2 11:54:58 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.6
GitCommit: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc:
Version: 1.0.0-rc95
GitCommit: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
docker-init:
Version: 0.19.0
GitCommit: de40ad0
$ docker info
Client:
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
compose: Docker Compose (Docker Inc., 2.0.0-beta.4)
scan: Docker Scan (Docker Inc., v0.8.0)

Server:
Containers: 3
Running: 3
Paused: 0
Stopped: 0
Images: 14
Server Version: 20.10.7
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.25-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 15.64GiB
Name: docker-desktop
ID: KRBC:JFMO:5JBX:DZZW:WCQU:NYET:WWGQ:U3CT:7IJS:DBZZ:LYNS:JZG7
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

logileifs added the bug label Jul 20, 2021
logileifs changed the title from [BUG] to [BUG] node-label not working Jul 20, 2021
@iwilltry42
Member

Hi @logileifs , thanks for opening this issue!
That's actually expected, as the --label flag sets runtime labels on the node containers, not Kubernetes labels.
It's mentioned in the help text, but I understand the confusion, as the label concept is the same (we're just applying it at the Docker level):

  -l, --label KEY[=VALUE][@NODEFILTER[;NODEFILTER...]]                 Add label to node container (Format: KEY[=VALUE][@NODEFILTER[;NODEFILTER...]]
                                                                        - Example: `k3d cluster create --agents 2 -l "my.label@agent[0,1]" -l "other.label=somevalue@server[0]"
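
To see where that label actually ends up, you can inspect the node container directly (just a sketch, using the node name from your example; the exact label set will vary):

$ # the --label value shows up on the Docker container...
$ docker inspect k3d-test-cluster-agent-0 --format '{{ json .Config.Labels }}'
$ # ...while the Kubernetes node object stays unchanged
$ kubectl get node k3d-test-cluster-agent-0 --show-labels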

However, what you want to achieve here is already implemented and will land in the v5 release next month 👍

In the meantime, you can use --k3s-agent-arg "--node-label=foo=bar", but that applies to all agent nodes at once 🙄
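Roughly like this (a sketch only, reusing the cluster setup from your report; the argument is passed to every agent, so the label ends up on all agent nodes):

$ k3d cluster create test-cluster -a 1 --k3s-agent-arg "--node-label=foo=bar"
$ kubectl get node k3d-test-cluster-agent-0 --show-labels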
Stay tuned for the next release :)

@logileifs
Author

@iwilltry42 Ahh, I see, pardon my confusion. Definitely looking forward to that release, and thanks for a great product 👍

@logileifs
Author

I saw you had a new dev release and I was trying out the example you have there: k3dv5 cluster create --agents 2 --k3s-node-label "my.label@agent[0,1]" --k3s-node-label "other.label=somevalue@server:0", but I just get the error FATA[0000] Failed to parse node filters: invalid format or empty subset in 'agent[0,1]'
Is that to be expected since it's only a dev release, or should that feature be working there?

@iwilltry42
Member

@logileifs I was just going to ask you to try https://github.com/rancher/k3d/releases/tag/v5.0.0-dev.0 😁
The feature works there, but the syntax of nodefilters changed (and is not yet updated in the help text): k3dv5 cluster create --agents 2 --k3s-node-label "my.label@agent:0,1" --k3s-node-label "other.label=somevalue@server:0" (apparently the second flag was correct already 😬)
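
To double-check that the labels actually made it onto the Kubernetes node objects after creation, something like this should do it (a sketch, assuming the default cluster name):

$ kubectl get nodes --show-labels
$ # or for a single node:
$ kubectl get node k3d-k3s-default-agent-0 --show-labels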

@logileifs
Author

I tried it out and it didn't work out so well:

$ k3dv5 cluster create --agents 2 --k3s-node-label "node-role.kubernetes.io/worker=worker@agent:*" --k3s-node-label "other.label=somevalue@server:0"
INFO[0000] Prep: Network
INFO[0004] Created network 'k3d-k3s-default' (8257a2c4c33bd6f143f29b36a6ec683cdae8e103b48630b83c235a54f61e8275)
INFO[0004] Created volume 'k3d-k3s-default-images'
INFO[0005] Creating node 'k3d-k3s-default-server-0'
INFO[0005] Creating node 'k3d-k3s-default-agent-0'
INFO[0005] Creating node 'k3d-k3s-default-agent-1'
INFO[0005] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0008] Pulling image 'docker.io/rancher/k3d-proxy:v5.0.0-dev.0'
INFO[0010] Starting cluster 'k3s-default'
INFO[0010] Starting servers...
INFO[0010] Starting Node 'k3d-k3s-default-server-0'
INFO[0018] Starting agents...
INFO[0018] Starting Node 'k3d-k3s-default-agent-1'
INFO[0018] Starting Node 'k3d-k3s-default-agent-0'
ERRO[0038] Failed to create docker client
ERRO[0038] Failed Cluster Start: Failed to add one or more agents: Node k3d-k3s-default-agent-1 failed to get ready: Failed waiting for log message 'Successfully registered node' from node 'k3d-k3s-default-agent-1': open /Users/logi/.docker/contexts/tls/fe9c6bd7a66301f49ca9b6a70b217107cd1284598bfc254700c989b916da791e: too many open files
ERRO[0038] Failed to create cluster >>> Rolling Back
INFO[0038] Deleting cluster 'k3s-default'
WARNING: Error loading config file: /Users/logi/.docker/config.json: open /Users/logi/.docker/config.json: too many open files
ERRO[0038] Failed to get nodes for cluster 'k3s-default': Failed to list containers: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22label%22%3A%7B%22app%3Dk3d%22%3Atrue%2C%22k3d.cluster%3Dk3s-default%22%3Atrue%7D%7D&limit=0": dial unix /var/run/docker.sock: socket: too many open files
ERRO[0038] No nodes found for given cluster
FATA[0038] Cluster creation FAILED, also FAILED to rollback changes!
$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?
$ k3dv5 cluster list
NAME          SERVERS   AGENTS   LOADBALANCER
k3s-default   1/1       2/2      true
$ k3dv5 cluster delete k3s-default
FATA[0000] error getting loadbalancer config for cluster k3s-default: Error: No such container:path: cde656e48c7195841c13d130a5a0b956f8baf57cf90df977b5b81562a47f65aa:/etc/confd/values.yaml: file not found

How can I remove the cluster now?

@logileifs
Author

I managed to remove the cluster using my old k3d version, and I managed to create a cluster with node labels using this command: k3dv5 cluster create --agents 1 --k3s-node-label "other.label=somevalue@agent:0". However, it always fails when I try to create a cluster with this one: k3dv5 cluster create --agents 1 --k3s-node-label "node-role.kubernetes.io/worker=worker@agent:0"

@iwilltry42
Member

Hi again 👋
Well, the WARNING: Error loading config file: /Users/logi/.docker/config.json: open /Users/logi/.docker/config.json: too many open files part is an error coming from your host system / Docker.
If you google just the last part (too many open files), you'll find plenty of solutions/workarounds for it (mostly similar to this: rclone/rclone#1111 (comment)).

FATA[0000] error getting loadbalancer config for cluster k3s-default: Error: No such container:path: cde656e48c7195841c13d130a5a0b956f8baf57cf90df977b5b81562a47f65aa:/etc/confd/values.yaml: file not found

This, however, is a bug that has since been resolved in #683 👍


I managed to remove the cluster using my old k3d version, and I managed to create a cluster with node labels using this command: k3dv5 cluster create --agents 1 --k3s-node-label "other.label=somevalue@agent:0". However, it always fails when I try to create a cluster with this one: k3dv5 cluster create --agents 1 --k3s-node-label "node-role.kubernetes.io/worker=worker@agent:0"

Now this part makes no sense to me 🤔
Can you check the logs of the container you're trying to set that label on (i.e. docker logs k3d-k3s-default-agent-0)?
I just gave it a try and I see these logs:

E0722 06:38:00.095034       7 server.go:191] "Failed to validate kubelet flags" err="unknown 'kubernetes.io' or 'k8s.io' labels specified with --node-labels: [node-role.kubernetes.io/worker]\n--node-labels in the 'kubernetes.io' namespace must begin with an allowed prefix (kubelet.kubernetes.io, node.kubernetes.io) or be in the specifically allowed set (beta.kubernetes.io/arch, beta.kubernetes.io/instance-type, beta.kubernetes.io/os, failure-domain.beta.kubernetes.io/region, failure-domain.beta.kubernetes.io/zone, kubernetes.io/arch, kubernetes.io/hostname, kubernetes.io/os, node.kubernetes.io/instance-type, topology.kubernetes.io/region, topology.kubernetes.io/zone)"

And then I guess when it goes into the restart loop, you'll run out of file handles at some point (I get the same too many open files message after a while on my Linux system).
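
If you really need a label in the kubernetes.io namespace at creation time, one under the allowed prefixes from that error message should pass the kubelet's validation (an untested sketch; the label key here is just an example):

$ # node.kubernetes.io/... is one of the prefixes the kubelet allows for self-labeling
$ k3dv5 cluster create --agents 1 --k3s-node-label "node.kubernetes.io/my-role=worker@agent:0"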

@logileifs
Author

I found this issue and, if I am understanding it correctly, it's no longer possible to set that label when a node is created, only afterwards using kubectl.
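
If I'm reading that right, it boils down to labeling the node after the cluster is up, e.g. (a sketch, using the agent name from the default cluster):

$ kubectl label node k3d-k3s-default-agent-0 node-role.kubernetes.io/worker=worker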

sdblepas added a commit to sdblepas/k3d-deploy that referenced this issue Sep 3, 2021
#this part is for bug k3d-io/k3d#681 should be remove eventually