Cilium v1.14.5 not working #3676

Closed
sylr opened this issue Dec 16, 2023 · 4 comments
Labels
type/bug Something isn't working

Comments

@sylr

sylr commented Dec 16, 2023

Platform I'm building on:

We made Cilium 1.13 work with Bottlerocket 1.16, but with Cilium 1.14 and Bottlerocket 1.17 our nodes fail to become ready.

What I expected to happen:

Cilium to start successfully.

What actually happened:

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Pulling    117s               kubelet            Pulling image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b"
  Normal   Scheduled  116s               default-scheduler  Successfully assigned kube-system/cilium-bottlerocket-kzrxm to ip-10-132-29-78.eu-central-1.compute.internal
  Normal   Pulled     110s               kubelet            Successfully pulled image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" in 6.038s (6.038s including waiting)
  Normal   Created    110s               kubelet            Created container config
  Normal   Started    109s               kubelet            Started container config
  Normal   Pulled     104s               kubelet            Container image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" already present on machine
  Normal   Created    104s               kubelet            Created container mount-cgroup
  Normal   Started    104s               kubelet            Started container mount-cgroup
  Normal   Pulled     103s               kubelet            Container image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" already present on machine
  Normal   Created    103s               kubelet            Created container apply-sysctl-overwrites
  Normal   Started    103s               kubelet            Started container apply-sysctl-overwrites
  Normal   Created    102s               kubelet            Created container mount-bpf-fs
  Normal   Pulled     102s               kubelet            Container image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" already present on machine
  Normal   Started    102s               kubelet            Started container mount-bpf-fs
  Normal   Pulled     101s               kubelet            Container image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" already present on machine
  Normal   Created    101s               kubelet            Created container clean-cilium-state
  Normal   Started    101s               kubelet            Started container clean-cilium-state
  Normal   Pulled     100s               kubelet            Container image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" already present on machine
  Normal   Created    100s               kubelet            Created container install-cni-binaries
  Normal   Started    100s               kubelet            Started container install-cni-binaries
  Warning  Failed     99s                kubelet            Error: failed to generate container "cdf6b5900d6dd2d05f6bf600cbd21e485bc0a344654352cc8fb2cbfe023be211" spec: failed to apply OCI options: path "/boot" is mounted on "/boot" but it is not a shared or slave mount
  Warning  Failed     87s                kubelet            Error: failed to generate container "8ed081713770c93928a13fd214d674e448ca66a09814d958230080462aa91d59" spec: failed to apply OCI options: path "/boot" is mounted on "/boot" but it is not a shared or slave mount
  Normal   Pulled     75s (x3 over 99s)  kubelet            Container image "artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b" already present on machine
  Warning  Failed     75s                kubelet            Error: failed to generate container "7b04455bd6d4557830b4ab4de92a73440b15af61c61cb176da321da669cdefd5" spec: failed to apply OCI options: path "/boot" is mounted on "/boot" but it is not a shared or slave mount
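
The key line is the OCI error: the kubelet refuses to honor mountPropagation: HostToContainer when the host path backing the volume is a private mount. As a quick check (standard util-linux tooling, not part of the original report), the propagation mode of /boot can be inspected on an affected node:

# Show the propagation mode of the /boot mount point on the host
findmnt -o TARGET,PROPAGATION /boot

If the PROPAGATION column reads private, the kubelet will reject any container mount of /boot that requests HostToContainer or Bidirectional propagation, which matches the events above.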

How to reproduce the problem:

Cilium DaemonSet:

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/name: cilium-agent
    app.kubernetes.io/part-of: cilium
    k8s-app: cilium
  name: cilium-bottlerocket
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: cilium
  template:
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/apply-sysctl-overwrites: unconfined
        container.apparmor.security.beta.kubernetes.io/cilium-agent: unconfined
        container.apparmor.security.beta.kubernetes.io/clean-cilium-state: unconfined
        container.apparmor.security.beta.kubernetes.io/mount-cgroup: unconfined
        enable.version-checker.io/cilium-agent: "true"
        prometheus.io/port: "9962"
        prometheus.io/scrape: "true"
      labels:
        app.kubernetes.io/name: cilium-agent
        app.kubernetes.io/part-of: cilium
        k8s-app: cilium
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
            - matchExpressions:
              - key: eks.amazonaws.com/compute-type
                operator: NotIn
                values:
                - fargate
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values:
                - cilium
            topologyKey: kubernetes.io/hostname
      automountServiceAccountToken: true
      containers:
      - args:
        - --config-dir=/tmp/cilium/config-map
        command:
        - cilium-agent
        env:
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: CILIUM_K8S_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: CILIUM_CLUSTERMESH_CONFIG
          value: /var/lib/cilium/clustermesh/
        - name: KUBERNETES_SERVICE_HOST
          value: B5229F69B52611640783AA0C9A81E665.gr7.eu-central-1.eks.amazonaws.com
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        lifecycle:
          postStart:
            exec:
              command:
              - bash
              - -c
              - |
                set -o errexit
                set -o pipefail
                set -o nounset

                # When running in AWS ENI mode, it's likely that 'aws-node' has
                # had a chance to install SNAT iptables rules. These can result
                # in dropped traffic, so we should attempt to remove them.
                # We do it using a 'postStart' hook since this may need to run
                # for nodes which might have already been init'ed but may still
                # have dangling rules. This is safe because there are no
                # dependencies on anything that is part of the startup script
                # itself, and can be safely run multiple times per node (e.g. in
                # case of a restart).
                if [[ "$(iptables-save | grep -c 'AWS-SNAT-CHAIN|AWS-CONNMARK-CHAIN')" != "0" ]];
                then
                    echo 'Deleting iptables rules created by the AWS CNI VPC plugin'
                    iptables-save | grep -v 'AWS-SNAT-CHAIN|AWS-CONNMARK-CHAIN' | iptables-restore
                fi
                echo 'Done!'
          preStop:
            exec:
              command:
              - /cni-uninstall.sh
        livenessProbe:
          failureThreshold: 10
          httpGet:
            host: 127.0.0.1
            httpHeaders:
            - name: brief
              value: "true"
            path: /healthz
            port: 9879
            scheme: HTTP
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 5
        name: cilium-agent
        ports:
        - containerPort: 4244
          hostPort: 4244
          name: peer-service
          protocol: TCP
        - containerPort: 9962
          hostPort: 9962
          name: prometheus
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            host: 127.0.0.1
            httpHeaders:
            - name: brief
              value: "true"
            path: /healthz
            port: 9879
            scheme: HTTP
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 5
        resources: {}
        securityContext:
          capabilities:
            add:
            - CHOWN
            - KILL
            - NET_ADMIN
            - NET_RAW
            - IPC_LOCK
            - SYS_MODULE
            - SYS_ADMIN
            - SYS_RESOURCE
            - DAC_OVERRIDE
            - FOWNER
            - SETGID
            - SETUID
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        startupProbe:
          failureThreshold: 105
          httpGet:
            host: 127.0.0.1
            httpHeaders:
            - name: brief
              value: "true"
            path: /healthz
            port: 9879
            scheme: HTTP
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /host/proc/sys/net
          name: host-proc-sys-net
        - mountPath: /host/proc/sys/kernel
          name: host-proc-sys-kernel
        - mountPath: /sys/fs/bpf
          mountPropagation: HostToContainer
          name: bpf-maps
        - mountPath: /var/run/cilium
          name: cilium-run
        - mountPath: /host/etc/cni/net.d
          name: etc-cni-netd
        - mountPath: /var/lib/cilium/clustermesh
          name: clustermesh-secrets
          readOnly: true
        - mountPath: /etc/config
          name: ip-masq-agent
          readOnly: true
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /run/xtables.lock
          name: xtables-lock
        - mountPath: /var/lib/cilium/tls/hubble
          name: hubble-tls
          readOnly: true
        - mountPath: /tmp
          name: tmp
        - mountPath: /boot
          mountPropagation: HostToContainer
          name: host-boot
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      imagePullSecrets:
      - name: docker-pull-config
      initContainers:
      - command:
        - cilium
        - build-config
        env:
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: CILIUM_K8S_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: KUBERNETES_SERVICE_HOST
          value: B5229F69B52611640783AA0C9A81E665.gr7.eu-central-1.eks.amazonaws.com
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        name: config
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /tmp
          name: tmp
      - command:
        - sh
        - -ec
        - |
          cp /usr/bin/cilium-mount /hostbin/cilium-mount;
          nsenter --cgroup=/hostproc/1/ns/cgroup --mount=/hostproc/1/ns/mnt "${BIN_PATH}/cilium-mount" $CGROUP_ROOT;
          rm /hostbin/cilium-mount
        env:
        - name: CGROUP_ROOT
          value: /run/cilium/cgroupv2
        - name: BIN_PATH
          value: /opt/cni/bin
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        name: mount-cgroup
        resources: {}
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
            - SYS_CHROOT
            - SYS_PTRACE
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /hostproc
          name: hostproc
        - mountPath: /hostbin
          name: cni-path
      - command:
        - sh
        - -ec
        - |
          cp /usr/bin/cilium-sysctlfix /hostbin/cilium-sysctlfix;
          nsenter --mount=/hostproc/1/ns/mnt "${BIN_PATH}/cilium-sysctlfix";
          rm /hostbin/cilium-sysctlfix
        env:
        - name: BIN_PATH
          value: /opt/cni/bin
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        name: apply-sysctl-overwrites
        resources: {}
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
            - SYS_CHROOT
            - SYS_PTRACE
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /hostproc
          name: hostproc
        - mountPath: /hostbin
          name: cni-path
      - args:
        - mount | grep "/sys/fs/bpf type bpf" || mount -t bpf bpf /sys/fs/bpf
        command:
        - /bin/bash
        - -c
        - --
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        name: mount-bpf-fs
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /sys/fs/bpf
          mountPropagation: Bidirectional
          name: bpf-maps
      - command:
        - /init-container.sh
        env:
        - name: CILIUM_ALL_STATE
          valueFrom:
            configMapKeyRef:
              key: clean-cilium-state
              name: cilium-config
              optional: true
        - name: CILIUM_BPF_STATE
          valueFrom:
            configMapKeyRef:
              key: clean-cilium-bpf-state
              name: cilium-config
              optional: true
        - name: KUBERNETES_SERVICE_HOST
          value: B5229F69B52611640783AA0C9A81E665.gr7.eu-central-1.eks.amazonaws.com
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        name: clean-cilium-state
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - SYS_MODULE
            - SYS_ADMIN
            - SYS_RESOURCE
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /sys/fs/bpf
          name: bpf-maps
        - mountPath: /run/cilium/cgroupv2
          mountPropagation: HostToContainer
          name: cilium-cgroup
        - mountPath: /var/run/cilium
          name: cilium-run
      - command:
        - /install-plugin.sh
        image: artifactory.company.com/docker/cilium/cilium:v1.14.5@sha256:d3b287029755b6a47dee01420e2ea469469f1b174a2089c10af7e5e9289ef05b
        imagePullPolicy: IfNotPresent
        name: install-cni-binaries
        resources:
          requests:
            cpu: 100m
            memory: 10Mi
        securityContext:
          capabilities:
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /host/opt/cni/bin
          name: cni-path
      nodeSelector:
        kubernetes.io/os: linux
        company.com/os: bottlerocket
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: cilium
      serviceAccountName: cilium
      terminationGracePeriodSeconds: 1
      tolerations:
      - key: node.cilium.io/agent-not-ready
        operator: Exists
      - effect: NoSchedule
        key: company.com/dedicated
        operator: Exists
      - effect: NoSchedule
        key: company.com/os
        operator: Equal
        value: bottlerocket
      - effect: NoSchedule
        key: node.kubernetes.io/not-ready
        operator: Exists
      - effect: NoSchedule
        key: karpenter.sh/not-ready
        operator: Exists
      volumes:
      - emptyDir: {}
        name: tmp
      - hostPath:
          path: /var/run/cilium
          type: DirectoryOrCreate
        name: cilium-run
      - hostPath:
          path: /sys/fs/bpf
          type: DirectoryOrCreate
        name: bpf-maps
      - hostPath:
          path: /proc
          type: Directory
        name: hostproc
      - hostPath:
          path: /run/cilium/cgroupv2
          type: DirectoryOrCreate
        name: cilium-cgroup
      - hostPath:
          path: /opt/cni/bin
          type: DirectoryOrCreate
        name: cni-path
      - hostPath:
          path: /etc/cni/net.d
          type: DirectoryOrCreate
        name: etc-cni-netd
      - hostPath:
          path: /lib/modules
          type: ""
        name: lib-modules
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
      - hostPath:
          path: /tmp/cilium-bootstrap.d
          type: DirectoryOrCreate
        name: cilium-bootstrap-file-dir
      - name: clustermesh-secrets
        projected:
          defaultMode: 256
          sources:
          - secret:
              name: cilium-clustermesh
              optional: true
          - secret:
              items:
              - key: tls.key
                path: common-etcd-client.key
              - key: tls.crt
                path: common-etcd-client.crt
              - key: ca.crt
                path: common-etcd-client-ca.crt
              name: clustermesh-apiserver-remote-cert
              optional: true
      - configMap:
          defaultMode: 420
          items:
          - key: config
            path: ip-masq-agent
          name: ip-masq-agent
          optional: true
        name: ip-masq-agent
      - hostPath:
          path: /proc/sys/net
          type: Directory
        name: host-proc-sys-net
      - hostPath:
          path: /proc/sys/kernel
          type: Directory
        name: host-proc-sys-kernel
      - name: hubble-tls
        projected:
          defaultMode: 256
          sources:
          - secret:
              items:
              - key: tls.crt
                path: server.crt
              - key: tls.key
                path: server.key
              - key: ca.crt
                path: client-ca.crt
              name: hubble-server-certs
              optional: true
      - hostPath:
          path: /boot
          type: Directory
        name: host-boot
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 2
    type: RollingUpdate
@sylr added the status/needs-triage (Pending triage or re-evaluation) and type/bug (Something isn't working) labels on Dec 16, 2023
@bcressey
Contributor

Just a guess based on this error:

failed to apply OCI options: path "/boot" is mounted on "/boot" but it is not a shared or slave mount

This may be a side effect of the fix in #3591 that stopped overmounting the real /boot at runtime.

You may be able to work around this by removing the mount propagation line in the pod spec:

- mountPath: /boot
  # try removing this:
  # mountPropagation: HostToContainer
  name: host-boot
  readOnly: true

There shouldn't be any additional mount activity happening on /boot at runtime, so I don't think mount propagation will matter for Cilium. When mountPropagation is omitted, Kubernetes defaults to None (a private mount), which doesn't require the host mount point to be shared or slave.
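
For this specific manifest, the workaround can also be applied in place with a JSON patch; a sketch, assuming the volumeMount order shown in the DaemonSet above (host-boot is the twelfth entry, index 11, of the cilium-agent container's volumeMounts) and the kube-system namespace from the events:

# Drop only the mountPropagation field; the read-only /boot mount stays in place
kubectl -n kube-system patch daemonset cilium-bottlerocket --type=json \
  -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/volumeMounts/11/mountPropagation"}]'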

For a potential fix, we could try adding MS_SHARED to the mount flags in Bottlerocket's /boot mount setup.
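
For reference, the userspace equivalent of that flag change can be tested from a root shell on an affected host before touching Bottlerocket itself (a hedged sketch, not a confirmed fix):

# Mark the existing /boot mount point as shared, the same mount(2)
# MS_SHARED change the proposed fix would bake in at boot
mount --make-shared /boot

If the kubelet accepts the HostToContainer mount afterwards, the MS_SHARED approach should be viable.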

@sylr
Author

sylr commented Dec 20, 2023

Thanks @bcressey, it seems to work alright.

@yeazelm removed the status/needs-triage (Pending triage or re-evaluation) label on Dec 20, 2023
@yeazelm
Contributor

yeazelm commented Dec 20, 2023

It sounds like this solved your problem, @sylr. Is it OK to resolve this issue?

@sylr
Author

sylr commented Dec 20, 2023

I guess so; maybe @bcressey wants to act on his MS_SHARED proposal?

@sylr closed this as completed on Nov 22, 2024