
Fluentd Pods stuck in "ContainerCreating" status in AKS post upgrade #1005

Closed
CincomGithubService opened this issue Dec 11, 2020 · 1 comment


@CincomGithubService

We upgraded our AKS cluster from version 1.18.8 to 1.19.3. During the upgrade, the pods were restarted/recreated on the upgraded nodes. The fluentd-es-* pods have been stuck in "ContainerCreating" status ever since. We tried uninstalling and reinstalling BKPR/kube-prod-runtime as well. All of the other pods come up fine and everything else works, except for these fluentd-es pods, which report this error message: "Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[varlogbuffers varlogpos fluentd-es-token-6sttb config configd varlibdockercontainers varlog]: timed out waiting for the condition". The other error message we see in the events for these fluentd pods, prior to the one above, is "MountVolume.SetUp failed for volume "varlibdockercontainers" : hostPath type check failed: /var/lib/docker/containers is not a directory".

We have not been able to figure out a solution. If someone can assist us, we would really appreciate it.
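
One way to check whether the hostPath in question actually exists on an affected node is to open a shell on the node itself. A minimal sketch, assuming kubectl v1.20 or newer for kubectl debug node/ (on older clusters, SSH or a node-shell plugin works the same way; the node name is the one the stuck pod was scheduled to):

kubectl debug node/aks-nodepool1-50503828-vmss000000 -it --image=busybox \
  -- chroot /host ls -ld /var/lib/docker/containers
# The node's filesystem is mounted at /host inside the debug pod.
# "No such file or directory" here means the kubelet's hostPath type
# check will keep failing the mount.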

kubectl get all --namespace=kubeprod
NAME                                            READY   STATUS              RESTARTS   AGE
pod/alertmanager-0                              2/2     Running             0          30m
pod/cert-manager-666c8b7f77-8l6pc               1/1     Running             0          30m
pod/elasticsearch-logging-0                     2/2     Running             0          30m
pod/elasticsearch-logging-1                     2/2     Running             0          30m
pod/elasticsearch-logging-2                     2/2     Running             0          30m
pod/external-dns-77d4fc59bf-f4xwl               1/1     Running             0          30m
pod/fluentd-es-2ksjg                            0/1     ContainerCreating   0          30m
pod/fluentd-es-6c892                            0/1     ContainerCreating   0          30m
pod/fluentd-es-9jf4n                            0/1     ContainerCreating   0          30m
pod/fluentd-es-rq4m9                            0/1     ContainerCreating   0          28m
pod/fluentd-es-x9bcr                            0/1     ContainerCreating   0          30m
pod/grafana-0                                   1/1     Running             0          30m
pod/kibana-58446f5b6-xxns4                      1/1     Running             0          30m
pod/kube-state-metrics-848584cb68-vrtkq         2/2     Running             0          30m
pod/nginx-ingress-controller-564c9845cf-7wf97   1/1     Running             0          30m
pod/nginx-ingress-controller-564c9845cf-xj97x   1/1     Running             0          30m
pod/node-exporter-77bdd                         1/1     Running             0          30m
pod/node-exporter-9pxml                         1/1     Running             0          30m
pod/node-exporter-bt6sp                         1/1     Running             0          30m
pod/node-exporter-nf4zw                         1/1     Running             0          28m
pod/node-exporter-sh7z7                         1/1     Running             0          30m
pod/oauth2-proxy-6fd457b756-jpsgr               1/1     Running             0          30m
pod/oauth2-proxy-6fd457b756-lccht               1/1     Running             0          30m
pod/prometheus-0                                2/2     Running             0          30m

NAME                            TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                      AGE
service/alertmanager            ClusterIP      172.29.133.30    <none>          9093/TCP                     30m
service/elasticsearch-logging   ClusterIP      None             <none>          9200/TCP                     30m
service/grafana                 ClusterIP      172.29.134.113   <none>          3000/TCP                     30m
service/kibana-logging          ClusterIP      172.29.133.137   <none>          5601/TCP                     30m
service/nginx-ingress           LoadBalancer   172.29.133.234   52.253.76.150   80:30008/TCP,443:31158/TCP   30m
service/oauth2-proxy            ClusterIP      172.29.132.90    <none>          4180/TCP                     30m
service/prometheus              ClusterIP      172.29.133.41    <none>          9090/TCP                     30m

NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/fluentd-es       5         5         0       5            0           <none>          30m
daemonset.apps/node-exporter    5         5         5       5            5           <none>          30m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager               1/1     1            1           30m
deployment.apps/external-dns               1/1     1            1           30m
deployment.apps/kibana                     1/1     1            1           30m
deployment.apps/kube-state-metrics         1/1     1            1           30m
deployment.apps/nginx-ingress-controller   2/2     2            2           30m
deployment.apps/oauth2-proxy               2/2     2            2           30m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-666c8b7f77               1         1         1       30m
replicaset.apps/external-dns-77d4fc59bf               1         1         1       30m
replicaset.apps/kibana-58446f5b6                      1         1         1       30m
replicaset.apps/kube-state-metrics-848584cb68         1         1         1       30m
replicaset.apps/nginx-ingress-controller-564c9845cf   2         2         2       30m
replicaset.apps/oauth2-proxy-6fd457b756               2         2         2       30m

NAME                                     READY   AGE
statefulset.apps/alertmanager            1/1     30m
statefulset.apps/elasticsearch-logging   3/3     30m
statefulset.apps/grafana                 1/1     30m
statefulset.apps/prometheus              1/1     30m

NAME                                                           REFERENCE                             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/nginx-ingress-controller   Deployment/nginx-ingress-controller   9%/80%    2         10        2          30m
horizontalpodautoscaler.autoscaling/oauth2-proxy               Deployment/oauth2-proxy               10%/80%   2         10        2          30m

NAME                                    SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/elasticsearch-curator     10 10 * * *   False     0        <none>          30m

kubectl describe pod/fluentd-es-x9bcr --namespace=kubeprod
Name:           fluentd-es-x9bcr
Namespace:      kubeprod
Priority:       0
Node:           aks-nodepool1-50503828-vmss000000/172.29.128.4
Start Time:     Thu, 10 Dec 2020 21:22:42 -0500
Labels:         controller-revision-hash=7b4497658c
                name=fluentd-es
                pod-template-generation=1
Annotations:    prometheus.io/path: /metrics
                prometheus.io/port: 24231
                prometheus.io/scrape: true
                scheduler.alpha.kubernetes.io/critical-pod:
Status:         Pending
IP:
IPs:            <none>
Controlled By:  DaemonSet/fluentd-es
Containers:
  fluentd-es:
    Container ID:
    Image:          bitnami/fluentd:1.11.1-debian-10-r27
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Command:
      fluentd
    Args:
      --config=/opt/bitnami/fluentd/conf/fluentd.conf
      --plugin=/opt/bitnami/fluentd/plugins
      --log=/opt/bitnami/fluentd/logs/fluentd.log
      --log-rotate-age=5
      --log-rotate-size=104857600
      --no-supervisor
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  500Mi
    Requests:
      cpu:     100m
      memory:  200Mi
    Environment:
      ES_HOST:  elasticsearch-logging.kubeprod.svc
    Mounts:
      /opt/bitnami/fluentd/conf from config (ro)
      /opt/bitnami/fluentd/conf/config.d from configd (ro)
      /var/lib/docker/containers from varlibdockercontainers (ro)
      /var/log from varlog (ro)
      /var/log/fluentd-buffers from varlogbuffers (rw)
      /var/log/fluentd-pos from varlogpos (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from fluentd-es-token-6sttb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      fluentd-es-30f242f
    Optional:  false
  configd:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      fluentd-es-configd-cbf6e63
    Optional:  false
  varlibdockercontainers:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/docker/containers
    HostPathType:  Directory
  varlog:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log
    HostPathType:  Directory
  varlogbuffers:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/fluentd-buffers
    HostPathType:  DirectoryOrCreate
  varlogpos:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/fluentd-pos
    HostPathType:  DirectoryOrCreate
  fluentd-es-token-6sttb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fluentd-es-token-6sttb
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason       Age                  From                                         Message
  ----     ------       ----                 ----                                         -------
  Normal   Scheduled    31m                  default-scheduler                            Successfully assigned kubeprod/fluentd-es-x9bcr to aks-nodepool1-50503828-vmss000000
  Warning  FailedMount  29m                  kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[config configd varlibdockercontainers varlog varlogbuffers varlogpos fluentd-es-token-6sttb]: timed out waiting for the condition
  Warning  FailedMount  24m                  kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[varlibdockercontainers varlog varlogbuffers varlogpos fluentd-es-token-6sttb config configd]: timed out waiting for the condition
  Warning  FailedMount  18m (x2 over 22m)    kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[fluentd-es-token-6sttb config configd varlibdockercontainers varlog varlogbuffers varlogpos]: timed out waiting for the condition
  Warning  FailedMount  15m (x2 over 27m)    kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[configd varlibdockercontainers varlog varlogbuffers varlogpos fluentd-es-token-6sttb config]: timed out waiting for the condition
  Warning  FailedMount  13m                  kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[varlogpos fluentd-es-token-6sttb config configd varlibdockercontainers varlog varlogbuffers]: timed out waiting for the condition
  Warning  FailedMount  11m (x2 over 20m)    kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[varlogbuffers varlogpos fluentd-es-token-6sttb config configd varlibdockercontainers varlog]: timed out waiting for the condition
  Warning  FailedMount  8m55s                kubelet, aks-nodepool1-50503828-vmss000000   Unable to attach or mount volumes: unmounted volumes=[varlibdockercontainers], unattached volumes=[varlog varlogbuffers varlogpos fluentd-es-token-6sttb config configd varlibdockercontainers]: timed out waiting for the condition
  Warning  FailedMount  51s (x23 over 31m)   kubelet, aks-nodepool1-50503828-vmss000000   MountVolume.SetUp failed for volume "varlibdockercontainers" : hostPath type check failed: /var/lib/docker/containers is not a directory
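
The last event is the telling one: the hostPath type check. The varlibdockercontainers volume is declared with HostPathType Directory, which requires the path to already exist on the node, whereas the buffer and pos volumes use DirectoryOrCreate and are created on demand. A sketch of how to read the declaration straight from the DaemonSet spec (the jsonpath filter is just one way to do this; output formatting varies by kubectl version):

kubectl get daemonset fluentd-es --namespace=kubeprod \
  -o jsonpath='{.spec.template.spec.volumes[?(@.name=="varlibdockercontainers")].hostPath}'
# Expected to show path /var/lib/docker/containers with type Directory.
# With type Directory the kubelet refuses the mount if the path is absent;
# DirectoryOrCreate would create an empty directory instead.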

@CincomGithubService
Author

Closing this issue, as we have resolved it. We created the missing directory "/var/lib/docker/containers" on each of the nodes in the cluster using a node shell, and all the fluentd pods came up fine after that. We had overlooked the error on our part.
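
For reference, a minimal sketch of that fix, assuming kubectl debug node/ is available (any way of running a privileged command on each node works just as well; the directory was likely missing because AKS 1.19 node pools default to the containerd runtime rather than Docker):

for node in $(kubectl get nodes -o name); do
  # "node/..." names from -o name work directly with kubectl debug
  kubectl debug "$node" --image=busybox -- chroot /host mkdir -p /var/lib/docker/containers
done
# Each run leaves a completed debug pod behind in the current namespace;
# delete those pods once the fluentd-es pods are Running.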

Thanks.
