ifname net1 is already exist #353

Closed
Elegant996 opened this issue Jul 30, 2019 · 22 comments

Elegant996 commented Jul 30, 2019

What happened:

Restarted a pod that was using Multus annotations and received an error stating that the ifname already exists.

What you expected to happen:

The pod with Multus annotations restarts successfully.

How to reproduce it (as minimally and precisely as possible):

Simply deleted the pod. I had done this several times prior, but this is the first time I've received an error. No updates were run since the previous pod restart. I performed a drain of all nodes and then rebooted, but the issue still persists.
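
For reference, the restart was nothing more exotic than deleting the pod and letting its controller recreate it (pod name taken from the event below):

kubectl delete pod myapp-0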

Anything else we need to know?:

Warning  FailedCreatePodSandBox  4m54s                 kubelet, kube-node01.example.com  Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_myapp-0_default_15e30db3-b2f1-11e9-bb27-000c29616bad_0(46d94054e20e92f252f1d169cecd6529ef220ae3cfdb79aaf7208885777c0de3): Multus: Err adding pod to network "macvlan-conf": cannot set "" ifname to "net1": ifname net1 is already exist 

Environment:

  • Multus version: nfvpe/multus:latest (image path and image ID from 'docker images': docker.io/nfvpe/multus, 9318454f544e)
  • Kubernetes version (use kubectl version): 1.14.1
  • Primary CNI for Kubernetes cluster: flannel
  • OS (e.g. from /etc/os-release): Fedora 30
  • File in '/etc/cni/net.d/': 00-multus.conf
  • File in '/etc/cni/multus/net.d': multus.kubeconfig
  • NetworkAttachment info (use kubectl get net-attach-def -o yaml):
apiVersion: v1
items:
- apiVersion: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"k8s.cni.cncf.io/v1","kind":"NetworkAttachmentDefinition","metadata":{"annotations":{},"name":"macvlan-conf","namespace":"default"},"spec":{"config":"{ \"cniVersion\": \"0.3.1\", \"name\": \"macvlan-conf\", \"plugins\": [ { \"type\": \"macvlan\", \"master\": \"net0\", \"ipam\": { \"type\": \"static\", \"addresses\": [ { \"address\": \"10.0.30.30/24\", \"gateway\": \"10.0.30.1\" } ], \"dns\": { \"nameservers\" : [\"10.0.30.1\"] } } }, { \"type\": \"tuning\", \"sysctl\": { \"net.ipv4.conf.all.arp_filter\": \"1\", \"net.ipv4.conf.default.arp_filter\": \"1\", \"net.ipv4.conf.all.arp_announce\": \"2\", \"net.ipv4.conf.default.arp_announce\": \"2\" } }, { \"type\": \"sbr\" }, { \"type\": \"portmap\", \"capabilities\": { \"portMappings\": true }, \"snat\": false } ] }"}}
    creationTimestamp: "2019-05-22T01:14:33Z"
    generation: 8
    name: macvlan-conf
    namespace: default
    resourceVersion: "23425922"
    selfLink: /apis/k8s.cni.cncf.io/v1/namespaces/default/network-attachment-definitions/macvlan-conf
    uid: f50f1c4e-7c2e-11e9-9fa6-000c29616bad
  spec:
    config: '{ "cniVersion": "0.3.1", "name": "macvlan-conf", "plugins": [ { "type":
      "macvlan", "master": "net0", "ipam": { "type": "static", "addresses": [ { "address":
      "10.0.30.30/24", "gateway": "10.0.30.1" } ], "dns": { "nameservers" : ["10.0.30.1"]
      } } }, { "type": "tuning", "sysctl": { "net.ipv4.conf.all.arp_filter": "1",
      "net.ipv4.conf.default.arp_filter": "1", "net.ipv4.conf.all.arp_announce": "2",
      "net.ipv4.conf.default.arp_announce": "2" } }, { "type": "sbr" }, { "type":
      "portmap", "capabilities": { "portMappings": true }, "snat": false } ] }'
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
  • Target pod yaml info (with annotation, use kubectl get pod <podname> -o yaml)
      annotations:
        k8s.v1.cni.cncf.io/networks: '[
          {
            "name": "macvlan-conf",
            "mac": "7e:1d:44:d8:56:0b"
          }
        ]'
  • Other log outputs (if you use multus logging)

dougbtv commented Jul 30, 2019

Thanks for the report @Elegant996 -- I'm going to try to replicate it here; I'm spinning up a lab environment now...

dougbtv commented Jul 30, 2019

Alright, it looks like I've been able to replicate it... Thanks for the great details in the report. I'm pretty sure this happens without me having to delete/restart the pod; it just happens the first time I launch the pod with this configuration. I haven't found a cause yet, but I wanted to document my steps...

I'm on the same :latest image...

[centos@kube-netmachine-node-1 ~]$ sudo docker images | grep -P "multus.+latest|REPO"
REPOSITORY               TAG                 IMAGE ID            CREATED             SIZE
nfvpe/multus             latest              9318454f544e        4 days ago          960MB

But a later kube version...

[centos@kube-netmachine-master ~]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Here are the steps I took to replicate:

[centos@kube-netmachine-master ~]$ cat nad.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-conf
spec:
  config: '{
	"cniVersion": "0.3.1",
	"name": "macvlan-conf",
	"plugins": [{
		"type": "macvlan",
		"master": "net0",
		"ipam": {
			"type": "static",
			"addresses": [{
				"address": "10.0.30.30/24",
				"gateway": "10.0.30.1"
			}],
			"dns": {
				"nameservers": ["10.0.30.1"]
			}
		}
	}, {
		"type": "tuning",
		"sysctl": {
			"net.ipv4.conf.all.arp_filter": "1",
			"net.ipv4.conf.default.arp_filter": "1",
			"net.ipv4.conf.all.arp_announce": "2",
			"net.ipv4.conf.default.arp_announce": "2"
		}
	}, {
		"type": "sbr"
	}, {
		"type": "portmap",
		"capabilities": {
			"portMappings": true
		},
		"snat": false
	}]
}'

[centos@kube-netmachine-master ~]$ kubectl create -f nad.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-conf created
[centos@kube-netmachine-master ~]$ kubectl get network-attachment-definitions.k8s.cni.cncf.io 
NAME           AGE
macvlan-conf   4s

[centos@kube-netmachine-master ~]$ cat pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: samplepod
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
       {
         "name": "macvlan-conf",
         "mac": "7e:1d:44:d8:56:0b"
       }
     ]'
spec:
  containers:
  - name: samplepod
    command: ["/bin/bash", "-c", "trap : TERM INT; sleep infinity & wait"]
    image: dougbtv/centos-network
[centos@kube-netmachine-master ~]$ kubectl create -f pod.yaml 
pod/samplepod created

And I wind up with approximately the same results...

[centos@kube-netmachine-master ~]$ watch -n1 kubectl get pods -o wide 
[centos@kube-netmachine-master ~]$ kubectl describe pod samplepod | grep -A4 -P "^Events"
Events:
  Type     Reason                  Age                     From                             Message
  ----     ------                  ----                    ----                             -------
  Normal   Scheduled               4m22s                   default-scheduler                Successfully assigned default/samplepod to kube-netmachine-node-1
  Warning  FailedCreatePodSandBox  4m15s                   kubelet, kube-netmachine-node-1  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "bcc827b1e20b43b5dc902ac553c3beba4a6dae5afee1fc6a8db13a3e53b5c830" network for pod "samplepod": NetworkPlugin cni failed to set up pod "samplepod_default" network: Multus: Err adding pod to network "macvlan-conf": Multus: error in invoke Conflist add - "macvlan-conf": error in getting result from AddNetworkList: failed to lookup master "net0": Link not found, failed to clean up sandbox container "bcc827b1e20b43b5dc902ac553c3beba4a6dae5afee1fc6a8db13a3e53b5c830" network for pod "samplepod": NetworkPlugin cni failed to teardown pod "samplepod_default" network: Multus: error in invoke Conflist Del - "macvlan-conf": error in getting result from DelNetworkList: Failed to get link net1: Link not found / Multus: error in invoke Conflist Del - "cbr0": error in getting result from DelNetworkList: invalid version "": the version is empty]

dougbtv commented Jul 30, 2019

fwiw... the quickstart method still works with this version...

[centos@kube-netmachine-master ~]$ cat <<EOF | kubectl create -f -
> apiVersion: "k8s.cni.cncf.io/v1"
> kind: NetworkAttachmentDefinition
> metadata:
>   name: macvlan-conf
> spec:
>   config: '{
>       "cniVersion": "0.3.0",
>       "type": "macvlan",
>       "master": "eth0",
>       "mode": "bridge",
>       "ipam": {
>         "type": "host-local",
>         "subnet": "192.168.1.0/24",
>         "rangeStart": "192.168.1.200",
>         "rangeEnd": "192.168.1.216",
>         "routes": [
>           { "dst": "0.0.0.0/0" }
>         ],
>         "gateway": "192.168.1.1"
>       }
>     }'
> EOF
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-conf created
[centos@kube-netmachine-master ~]$ cat <<EOF | kubectl create -f -
> apiVersion: v1
> kind: Pod
> metadata:
>   name: samplepod
>   annotations:
>     k8s.v1.cni.cncf.io/networks: macvlan-conf
> spec:
>   containers:
>   - name: samplepod
>     command: ["/bin/bash", "-c", "trap : TERM INT; sleep infinity & wait"]
>     image: dougbtv/centos-network
> EOF
pod/samplepod created
[centos@kube-netmachine-master ~]$ watch -n1 kubectl get pods -o wide 
[centos@kube-netmachine-master ~]$ kubectl get pods -o wide 
NAME        READY   STATUS    RESTARTS   AGE   IP            NODE                     NOMINATED NODE   READINESS GATES
samplepod   1/1     Running   0          13s   10.244.1.82   kube-netmachine-node-1   <none>           <none>
[centos@kube-netmachine-master ~]$ kubectl exec -it samplepod -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if83: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP 
    link/ether 1e:f8:0c:53:7f:9f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.82/24 scope global eth0
       valid_lft forever preferred_lft forever
4: net1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 52:8a:f0:96:11:b6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.1.212/24 scope global net1
       valid_lft forever preferred_lft forever

dougbtv commented Jul 30, 2019

@Elegant996 -- just noticed something in your network-attachment-definition config section...

	"plugins": [{
		"type": "macvlan",
		"master": "net0",
		"ipam": {

Is the master actually intended to be net0? For macvlan, the master is typically an interface on your host machine, like eth0 (or a systemd predictable name like ens0).

Just caught my eye on that one.
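
If you want to double-check, the macvlan master has to match one of the link names on the host where the pod is scheduled. A quick way to list them (plain iproute2, nothing Multus-specific):

# on the node: list host interfaces in brief form; "master" must be one of these names
ip -br link show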

dougbtv commented Jul 30, 2019

For what it's worth, I edited your net-attach-def to have a master of eth0 (which is what my primary interface is on the hosts in my lab cluster), and when I did so, I could create the pod -- and also delete and recreate it.

For reference:

[centos@kube-netmachine-master ~]$ cat nad.yaml 
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-conf
spec:
  config: '{
	"cniVersion": "0.3.1",
	"name": "macvlan-conf",
	"plugins": [{
		"type": "macvlan",
		"master": "eth0",
		"ipam": {
			"type": "static",
			"addresses": [{
				"address": "10.0.30.30/24",
				"gateway": "10.0.30.1"
			}],
			"dns": {
				"nameservers": ["10.0.30.1"]
			}
		}
	}, {
		"type": "tuning",
		"sysctl": {
			"net.ipv4.conf.all.arp_filter": "1",
			"net.ipv4.conf.default.arp_filter": "1",
			"net.ipv4.conf.all.arp_announce": "2",
			"net.ipv4.conf.default.arp_announce": "2"
		}
	}, {
		"type": "sbr"
	}, {
		"type": "portmap",
		"capabilities": {
			"portMappings": true
		},
		"snat": false
	}]
}'
[centos@kube-netmachine-master ~]$ kubectl create -f nad.yaml 
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-conf created
[centos@kube-netmachine-master ~]$ kubectl create -f pod.yaml 
pod/samplepod created
[centos@kube-netmachine-master ~]$ watch -n1 kubectl get pods -o wide 
[centos@kube-netmachine-master ~]$ kubectl exec -it samplepod -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if84: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP 
    link/ether 82:0d:db:80:49:7c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.83/24 scope global eth0
       valid_lft forever preferred_lft forever
4: net1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 3a:85:ab:bb:42:e3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.30.30/24 scope global net1
       valid_lft forever preferred_lft forever
[centos@kube-netmachine-master ~]$ 
[centos@kube-netmachine-master ~]$ 
[centos@kube-netmachine-master ~]$ kubectl delete pod samplepod 
pod "samplepod" deleted
[centos@kube-netmachine-master ~]$ kubectl create -f pod.yaml 
pod/samplepod created
[centos@kube-netmachine-master ~]$ watch -n1 kubectl get pods -o wide 
[centos@kube-netmachine-master ~]$ kubectl exec -it samplepod -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if85: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP 
    link/ether e6:29:12:2d:04:d1 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.84/24 scope global eth0
       valid_lft forever preferred_lft forever
4: net1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 86:5f:16:ad:4b:45 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.30.30/24 scope global net1
       valid_lft forever preferred_lft forever

dougbtv commented Jul 30, 2019

Sorry for the slew of responses... now that I re-read... The error I created actually differs from yours.

Yours:

Multus: Err adding pod to network "macvlan-conf": cannot set "" ifname to "net1": ifname net1 is already exist 

Mine was:

Multus: error in invoke Conflist Del - "macvlan-conf": error in getting result from DelNetworkList: Failed to get link net1: Link not found / Multus: error in invoke Conflist Del - "cbr0": error in getting result from DelNetworkList: invalid version "": the version is empty]

@Elegant996 -- could you enable debug logging and send me the logs? Add these params to your CNI configuration in /etc/cni/net.d/00-multus.conf (the default location):

"logLevel": "debug",
"logFile": "/var/log/multus.log",

(or any location for logFile)
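
For context, those keys sit at the top level of the Multus config, so the result should look roughly like this (a trimmed sketch; your kubeconfig path and delegates will differ):

{
        "cniVersion": "0.3.1",
        "name": "multus-cni-network",
        "type": "multus",
        "logLevel": "debug",
        "logFile": "/var/log/multus.log",
        "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
        "delegates": [ ... ]
}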

Also, if you can grab the contents of your /etc/cni/net.d/00-multus.conf, that would help too.

Elegant996 commented Jul 31, 2019

@dougbtv Yes, it is. When I initially set up the OS, it picked 'ens192' for the NIC instead of eth0 or 'ens0', so I used the systemd-networkd link file below to rename it to net0. This has been working without issue for quite some time, though. Thanks!

[Match]
MACAddress=xx:xx:xx:xx:xx:xx

[Link]
MACAddressPolicy=persistent
Name=net0
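
(To verify which .link file udev matched against the interface, you can dry-run its link handling; this assumes the file lives under /etc/systemd/network/:)

# dry-run the net_setup_link udev builtin for the renamed interface
udevadm test-builtin net_setup_link /sys/class/net/net0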

Output from ip a:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: net0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:61:6b:ad brd ff:ff:ff:ff:ff:ff
    inet 10.0.30.10/24 brd 10.0.30.255 scope global net0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe61:6bad/64 scope link
       valid_lft forever preferred_lft forever
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UNKNOWN group default
    link/ether 3e:e6:f4:52:e9:83 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::3ce6:f4ff:fe52:e983/64 scope link
       valid_lft forever preferred_lft forever
4: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue state UP group default qlen 1000
    link/ether 36:77:9a:3c:15:d4 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.1/24 brd 10.244.0.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::3477:9aff:fe3c:15d4/64 scope link
       valid_lft forever preferred_lft forever
5: veth09535afb@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc noqueue master cni0 state UP group default
    link/ether 72:4f:76:2c:ca:44 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::704f:76ff:fe2c:ca44/64 scope link
       valid_lft forever preferred_lft forever

Elegant996 commented Jul 31, 2019

Very bizarre... after MANY attempts over last night and today, the pod just decided to work again during one of the crash restarts. It was working before all this, but I didn't expect it to recover randomly.

I literally got home to add the debug lines and noticed the pod was up and running. I'll close the issue for now, as it's difficult to troubleshoot when the issue is no longer present.

Thanks!

Elegant996 commented Aug 16, 2019

The issue is still ongoing. I just realized that I never stated that CRI-O was being used. I have since upgraded Kubernetes to 1.15.2 and CRI-O to 1.15, but there has been no improvement.

I can occasionally get the pod to come up by repeatedly deleting it, but this is not a long-term solution.

EDIT: @dougbtv could this be related? That error appears alongside mine in some of the logs.

Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Configure SBR for new interface net1 - previous result: Interfaces:[{Name:net1 Mac:7e:1d:44:d8:56:0b Sandbox:/proc/16876/ns/net}], IP:[{Version:4 Interface:0xc000016c60 Address:{IP:10.0.30.30 Mask:ffffff00} Gateway:10.0.30.1}], DNS:{Nameservers:[] Domain: Search:[] Options:[]}
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Checking for relevant interface: net1
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Found IP address 10.0.30.30
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Network namespace to use and lock: /proc/16876/ns/net
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 First unreferenced table: 100
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Set rule for source {Version:4 Interface:0xc000016c60 Address:{IP:10.0.30.30 Mask:ffffff00} Gateway:10.0.30.1}
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Source to use 10.0.30.30/32
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Adding default route to gateway 10.0.30.1
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Moving route {Ifindex: 4 Dst: 10.0.30.0/24 Src: 10.0.30.30 Gw: <nil> Flags: [] Table: 254} from table 254 to 100
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Moving route {Ifindex: 4 Dst: fe80::/64 Src: <nil> Gw: <nil> Flags: [] Table: 254} from table 254 to 100
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Cleaning up SBR for net1
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Network namespace to use and lock: /proc/16876/ns/net
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule -1: from <nil> table 255
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule 32765: from 10.0.30.30/32 table 100
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Delete rule ip rule 32765: from 10.0.30.30/32 table 100
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule 32766: from <nil> table 254
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule 32767: from <nil> table 253
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule 32767: from <nil> table 253
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule 32767: from <nil> table 254
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule -1: from <nil> table 255
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Check rule: ip rule 32766: from <nil> table 254
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Cleaning up SBR for net1
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Network namespace to use and lock: /proc/16876/ns/net
Aug 16 12:45:40 kube-node01.example.com crio[965]: 2019/08/16 12:45:40 Failed to get link net1: Link not found
Aug 16 12:45:40 kube-node01.example.com crio[965]: time="2019-08-16 12:45:40.474382542-04:00" level=error msg="Error adding network: Multus: Err adding pod to network \"macvlan-conf\": cannot set \"\" ifname to \"net1\": ifname net1 is already exist"
Aug 16 12:45:40 kube-node01.example.com crio[965]: time="2019-08-16 12:45:40.474603792-04:00" level=error msg="Error while adding pod to CNI network \"multus-cni-network\": Multus: Err adding pod to network \"macvlan-conf\": cannot set \"\" ifname to \"net1\": ifname net1 is already exist"
Aug 16 12:45:40 kube-node01.example.com crio[965]: time="2019-08-16 12:45:40.474698487-04:00" level=error msg="Error deleting network: invalid version \"\": the version is empty"
Aug 16 12:45:40 kube-node01.example.com crio[965]: time="2019-08-16 12:45:40.474766608-04:00" level=error msg="Error while removing pod from CNI network \"multus-cni-network\": invalid version \"\": the version is empty"

Thanks!

@Elegant996

This appeared to be a really weird configuration issue. Re-creating my cluster resolved it. Now running Kubernetes 1.16.2. Closing. Thanks!

roncemer commented Jan 3, 2024

This bug STILL exists, and it is causing a ton of problems for me. I'm using multus-cni in k3s, and while it usually works right after installation, it soon gets into a state where it complains that the interface already exists. So... if the interface already exists, either re-use it, or delete it and re-create it. Super simple solution. Literally a no-brainer. Why let this issue fester for so many years?

toelke commented Feb 4, 2024

I found that this bug happens when Multus's configuration file gets generated in a corrupted way. Removing /etc/cni/net.d/00-multus.conf and restarting the Multus pod on the node helped.

Before:

{
        "cniVersion": "0.3.1",
        "name": "multus-cni-network",
        "type": "multus",
        "capabilities": {"bandwidth":true,"portMappings":true},
        "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
        "delegates": [
                {"capabilities":{"bandwidth":true,"portMappings":true},"cniVersion":"0.3.1","delegates":[{"capabilities":{"bandwidth":true,"portMappings":true},"cniVersion":"0.3.1","delegates":[{"capabilities":{"bandwidth":true,"portMappings":true},"cniVersion":"0.3.1","delegates":[{"cniVersion":"0.3.1","name":"k8s-pod-network","plugins":[{"datastore_type":"kubernetes","ipam":{"type":"calico-ipam"},"kubernetes":{"kubeconfig":"/etc/cni/net.d/calico-kubeconfig"},"log_file_path":"/var/log/calico/cni/cni.log","log_level":"info","mtu":0,"nodename":"rpi-srv02.gbw-5.okvm.de","policy":{"type":"k8s"},"type":"calico"},{"capabilities":{"portMappings":true},"snat":true,"type":"portmap"},{"capabilities":{"bandwidth":true},"type":"bandwidth"}]}],"kubeconfig":"/etc/cni/net.d/multus.d/multus.kubeconfig","name":"multus-cni-network","type":"multus"}],"kubeconfig":"/etc/cni/net.d/multus.d/multus.kubeconfig","name":"multus-cni-network","type":"multus"}],"kubeconfig":"/etc/cni/net.d/multus.d/multus.kubeconfig","name":"multus-cni-network","type":"multus"}
        ]
}

After:

{
        "cniVersion": "0.3.1",
        "name": "multus-cni-network",
        "type": "multus",
        "capabilities": {"bandwidth":true,"portMappings":true},
        "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
        "delegates": [
                {"cniVersion":"0.3.1","name":"k8s-pod-network","plugins":[{"datastore_type":"kubernetes","ipam":{"type":"calico-ipam"},"kubernetes":{"kubeconfig":"/etc/cni/net.d/calico-kubeconfig"},"log_file_path":"/var/log/calico/cni/cni.log","log_level":"info","mtu":0,"nodename":"rpi-srv02.gbw-5.okvm.de","policy":{"type":"k8s"},"type":"calico"},{"capabilities":{"portMappings":true},"snat":true,"type":"portmap"},{"capabilities":{"bandwidth":true},"type":"bandwidth"}]}
        ]
}

As you can see, there is something recursive going on with the delegates.
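
A quick way to spot the broken state on a node is to check whether the first delegate is itself a multus config (a sketch, assuming jq is available on the node):

# healthy: prints the real plugin type (e.g. "calico"); broken: prints "multus"
jq -r '.delegates[0].type // .delegates[0].plugins[0].type' /etc/cni/net.d/00-multus.conf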

debbabi commented May 23, 2024

@toelke I got the same problem of recursive delegates using Multus with Cilium as the default CNI.
This happens when I delete and re-create pods.
Is this a known bug, or some misconfiguration?

toelke commented May 23, 2024

I have no more insight into this. I only know to delete the file on the node whenever I see a pod stuck in ContainerCreating alongside weird Multus logs.
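
In script form the workaround is roughly this (a sketch; the namespace and pod label depend on how Multus was deployed, so check with kubectl get pods -A -o wide | grep multus first):

# on the affected node: remove the corrupted auto-generated config
sudo rm /etc/cni/net.d/00-multus.conf

# restart this node's Multus pod so it regenerates the file
kubectl -n kube-system delete pod -l app=multus \
  --field-selector spec.nodeName="$(hostname)"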

hagak commented May 31, 2024

I am also seeing the same issue, with the same workaround. I was hoping it was a config issue on my end.

@kfox1111

> I found that this bug happens when Multus's configuration file gets generated in a corrupted way. Removing /etc/cni/net.d/00-multus.conf and restarting the Multus pod on the node helped. [...] As you can see, there is something recursive going on with the delegates.

I just saw the same issue: an extra multus delegate. Removing the file and restarting the Multus pod cleared it.

Can someone reopen this issue please?

sebt3 commented Jun 16, 2024

> I just saw the same issue: an extra multus delegate. Removing the file and restarting the Multus pod cleared it.

Exactly the same here.

> Can someone reopen this issue please?

Yes, please.

kfox1111 commented Jun 17, 2024

Just saw this repeat itself after a cluster reboot! :/

This is very concerning. How do we get it fixed?

multus 4.0.2

hagak commented Jun 17, 2024

Have you tried the thick version? I noticed that after I switched, the problem seems to have gone away.

@kfox1111

The thick version suffers from #1213.

So, both versions seem to have issues with restarts/reboots. :/

@kfox1111

Stracing it, I don't understand how it's not getting through... It's like it's ignoring --multus-master-cni-file-name and not filtering out the 00-multus.conf. The code looks correct though, so I'm not sure how this is possible.

[pid  8107] execve("/thin_entrypoint", ["/thin_entrypoint", "--multus-master-cni-file-name=10-calico.conflist", "--namespace-isolation=true", "--global-namespaces=default", "--cni-conf-dir=/bitnami/multus-cni/host/etc/cni/net.d", "--multus-autoconfig-dir=/bitnami/multus-cni/host/etc/cni/net.d", "--cni-bin-dir=/bitnami/multus-cni/host/opt/cni/bin", "--multus-log-level=verbose"], 0x55942f9a6be0 /* 55 vars */) = 0
[pid  8107] openat(AT_FDCWD, "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", O_RDONLY) = 3
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
[pid  8107] newfstatat(AT_FDCWD, "/usr/src/multus-cni/bin/multus", {st_mode=S_IFREG|0755, st_size=32793464, ...}, 0) = 0
[pid  8107] --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=8107, si_uid=0} ---
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/_multus", 0xc0001801d8, 0) = -1 ENOENT (No such file or directory)
[pid  8107] openat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/_multus3666308831", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 3
[pid  8107] openat(AT_FDCWD, "/usr/src/multus-cni/bin/multus", O_RDONLY|O_CLOEXEC) = 7
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/multus", {st_mode=S_IFREG|0755, st_size=32793464, ...}, 0) = 0
[pid  8107] newfstatat(AT_FDCWD, "/usr/src/multus-cni/bin/multus", {st_mode=S_IFREG|0755, st_size=32793464, ...}, 0) = 0
[pid  8107] fchmodat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/_multus3666308831", 0755) = 0
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/multus", {st_mode=S_IFREG|0755, st_size=32793464, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] renameat(AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/_multus3666308831", AT_FDCWD, "/bitnami/multus-cni/host/opt/cni/bin/multus") = 0
[pid  8107] newfstatat(AT_FDCWD, "/var/run/secrets/kubernetes.io/serviceaccount/token", {st_mode=S_IFREG|0640, st_size=1177, ...}, 0) = 0
[pid  8107] --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=8107, si_uid=0} ---
[pid  8107] newfstatat(AT_FDCWD, "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt", {st_mode=S_IFREG|0644, st_size=1025, ...}, 0) = 0
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/multus.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
[pid  8107] openat(AT_FDCWD, "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt", O_RDONLY|O_CLOEXEC) = 3
[pid  8107] openat(AT_FDCWD, "/var/run/secrets/kubernetes.io/serviceaccount/token", O_RDONLY|O_CLOEXEC) = 3
[pid  8107] openat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/multus.d/multus.kubeconfig.new", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0600) = 3
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/multus.d/multus.kubeconfig", {st_mode=S_IFREG|0600, st_size=2876, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] renameat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/multus.d/multus.kubeconfig.new", AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/multus.d/multus.kubeconfig") = 0
[pid  8107] openat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d", O_RDONLY|O_CLOEXEC) = 3
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/multus.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/00-multus.conf", {st_mode=S_IFREG|0600, st_size=3508, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/calico-kubeconfig", {st_mode=S_IFREG|0600, st_size=2674, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/10-calico.conflist", {st_mode=S_IFREG|0644, st_size=659, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] openat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/00-multus.conf", O_RDONLY|O_CLOEXEC) = 3
[pid  8107] openat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/00-multus.conf.new", O_WRONLY|O_CREAT|O_CLOEXEC, 0600) = 3
[pid  8107] newfstatat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/00-multus.conf", {st_mode=S_IFREG|0600, st_size=3508, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  8107] renameat(AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/00-multus.conf.new", AT_FDCWD, "/bitnami/multus-cni/host/etc/cni/net.d/00-multus.conf") = 0

@kfox1111

I just found:
159f261
and
633985d

The first one in particular is the fix for this bug, and the second one would let me work around the bug, but I can't without it.

Can someone please cut a 4.0.3 release with these fixes? The thin plugin on 4.0.2 is totally broken. :/
