Description
Bug Report
Describe the bug
Since version 3.2.0, fluent-bit crashes with SIGSEGV on output to Azure Blob Storage and thus enters a crash loop.
With version 3.1.10, fluent-bit behaves normally.
To Reproduce
Fluent Bit v3.2.2
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
______ _ _ ______ _ _ _____ _____
| ___| | | | | ___ (_) | |____ |/ __ \
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __ / /`' / /'
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / \ \ / /
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /.___/ /./ /___
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ \____(_)_____/
[2024/12/02 17:25:00] [ info] [fluent bit] version=3.2.2, commit=a59c867924, pid=1
[2024/12/02 17:25:00] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/12/02 17:25:00] [ info] [simd ] disabled
[2024/12/02 17:25:00] [ info] [cmetrics] version=0.9.9
[2024/12/02 17:25:00] [ info] [ctraces ] version=0.5.7
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] initializing
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=0
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] initializing
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] storage_strategy='memory' (memory only)
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] db: delete unmonitored stale inodes from the database: count=0
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] https=1 host=kubernetes.default.svc.cluster.local port=443
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] token updated
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] connectivity OK
[2024/12/02 17:25:00] [ info] [output:es:es.0] worker #0 started
[2024/12/02 17:25:00] [ info] [output:azure_blob:azure_blob.1] account_name=logstideskp2, container_name=access-logs, blob_type=blockblob, emulator_mode=no, endpoint=logstideskp2.blob.core.windows.net, auth_type=key
[2024/12/02 17:25:00] [ info] [output:es:es.0] worker #1 started
[2024/12/02 17:25:00] [ info] [output:azure_blob:azure_blob.1] initializing worker
[2024/12/02 17:25:00] [ info] [output:azure_blob:azure_blob.1] worker #0 started
[2024/12/02 17:25:00] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2024/12/02 17:25:00] [ info] [sp] stream processor started
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=585645 watch_fd=1 name=/var/log/containers/aurora-letsencrypt-cert-manager-cainjector-77547cd645-d5rs5_cert-manager_cert-manager-cainjector-86b5ec5a32f3b6bb3279539443bc6d1d1a162f397b75aed1b2a691e85e60c841.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=579291 watch_fd=2 name=/var/log/containers/crs-559445b4f5-m9988_default_init-crs-ba7349072cf9474905a8d3eecf12637f07a877233e98287bb367933e275e28e0.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=586171 watch_fd=3 name=/var/log/containers/elastic-operator-0_elastic-system_manager-c5c443982a1d231b8b933293ad431f5b2fc13868ed17f44563762f303be2fcf3.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=585776 watch_fd=4 name=/var/log/containers/outputmanager-primary-558d5d778d-52z4c_default_outputmanager-b5085ae3687ef05226fa1e23d301c2212f5a769786f51f37ccb1c3d35327d7ed.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=586402 watch_fd=5 name=/var/log/containers/productconfig-db-prometheus-exporter-5bd89dd994-lshxj_default_mongodb-exporter-bf432bfdaa56e9ae91a92a1ae19b41b7f1771f83536286516e0c92ba87a6c23e.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=585842 watch_fd=6 name=/var/log/containers/static-content-aurora-static-content-deploytest_default_integration-test-40104681518b35c26f87bdceb089236cba991f403694860177ab81d14c9612f4.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] inotify_fs_add(): inode=586553 watch_fd=1 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_istio-init-7f5970a12e83f8736de2926e51ecc71fe1b51c815a31504eafcf7753c912ec6b.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] inotify_fs_add(): inode=586582 watch_fd=2 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_istio-proxy-4ac987779d51964574c8ac298b86d153b7ae03d985944474511481f8b47070e2.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=566543 watch_fd=7 name=/var/log/containers/crs-559445b4f5-m9988_default_crs-server-e584a852c412d682d688c42b670a40caa814541fc24f1a446fdd2fd574361c35.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=566547 watch_fd=8 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_controller-3c1cc39100ea9594e28bc30b903375cfa60cac2af3e153a226c1ed362dcbe39f.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] inotify_fs_add(): inode=566547 watch_fd=3 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_controller-3c1cc39100ea9594e28bc30b903375cfa60cac2af3e153a226c1ed362dcbe39f.log
[2024/12/02 17:25:04] [ info] [output:azure_blob:azure_blob.1] container 'access-logs' already exists
[2024/12/02 17:25:05] [ info] [output:azure_blob:azure_blob.1] content uploaded successfully: file=kube-ingress-controller.1733160304970
[2024/12/02 17:25:05] [ info] [output:azure_blob:azure_blob.1] blob id ZmxiLTE3MzMxNjAzMDQuOTcwMS5pZA== committed successfully
[2024/12/02 17:25:09] [error] [tls] error: error:0A0C0103:SSL routines::internal error
[2024/12/02 17:25:09] [error] [output:azure_blob:azure_blob.1] error requesting container properties
[2024/12/02 17:25:09] [ warn] [engine] failed to flush chunk '1-1733160305.509332912.flb', retry in 8 seconds: task_id=2, input=tail.1 > output=azure_blob.1 (out_id=1)
[2024/12/02 17:25:14] [engine] caught signal (SIGSEGV)
#0 0x7f109a80d45b in ???() at ???:0
#1 0x7f109a8111b7 in ???() at ???:0
#2 0x7f109a811bb3 in ???() at ???:0
#3 0x7f109a81d06b in ???() at ???:0
#4 0x7f109a81e431 in ???() at ???:0
#5 0x7f109a81eb47 in ???() at ???:0
#6 0x7f109a820335 in ???() at ???:0
#7 0x7f109a82140b in ???() at ???:0
#8 0x7f109a9f08ff in ???() at ???:0
#9 0x7f109aa2e48d in ???() at ???:0
#10 0x7f109aa2a6c4 in ???() at ???:0
#11 0x563bc9ddf50c in tls_net_handshake() at src/tls/openssl.c:981
#12 0x563bc9de0214 in flb_tls_session_create() at src/tls/flb_tls.c:556
#13 0x563bc9deee82 in flb_io_net_connect() at src/flb_io.c:175
#14 0x563bc9dc6c40 in create_conn() at src/flb_upstream.c:610
#15 0x563bc9dc6c40 in flb_upstream_conn_get() at src/flb_upstream.c:799
#16 0x563bc9e7e1a7 in ensure_container() at plugins/out_azure_blob/azure_blob.c:546
#17 0x563bc9e7e1a7 in cb_azure_blob_flush() at plugins/out_azure_blob/azure_blob.c:1046
#18 0x563bca2f8c26 in co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#19 0xffffffffffffffff in ???() at ???:0
Expected behavior
Fluent-bit should write the access log entries to the blob container.
Your Environment
- Version used: 3.2.0 and 3.2.2
- Configuration:
[SERVICE]
Flush 5
Log_Level info
Daemon off
Parsers_File parsers.conf
Parsers_File custom_parsers.conf
# required for readiness and liveness probes
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
Exclude_Path *kube-*,*azure-*,*blackbox-exporter-*,*blobfuse-flexvol-*,*coredns-*,*istio-*,*node-exporter-*,*keyvault-flexvolume-*,*kiali-*,*fluent-bit*
Parser cri_o
DB /var/log/flb_kube.db
Mem_Buf_Limit 10MB
Skip_Long_Lines On
Refresh_Interval 10
Inotify_Watcher ${INOTIFY_WATCHER_ENABLED}
[INPUT]
Name tail
Tag kube-ingress-controller
Path /var/log/containers/*ingress-nginx-controller*.log
Parser cri_o
DB /var/log/flb_kube_ingress_controller.db
Mem_Buf_Limit 10MB
Skip_Long_Lines On
Refresh_Interval 10
Inotify_Watcher ${INOTIFY_WATCHER_ENABLED}
# note the order of the filters.
# the lua filters must come after the kubernetes filter, otherwise the log field isn't yet decoded
# when reaching the lua code.
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc.cluster.local:443
Merge_Log On
Merge_Log_Trim On
Keep_Log Off
K8S-Logging.Parser On
K8S-Logging.Exclude On
Buffer_Size 1Mb
[FILTER]
Name lua
Match *
script ../scripts/extract_tag.lua
call extract_tag
[FILTER]
Name lua
Match kube*crs-server*
script ../scripts/crs.lua
call augment_crs
[FILTER]
Name record_modifier
Match *
Record resource_group prod2402130227
[OUTPUT]
Name es
Match kube*
Host ${ELASTICSEARCH_HOST_PREFIX}-es-http
Port 9200
HTTP_User ${ELASTIC_USERNAME}
HTTP_Passwd ${ELASTIC_PASSWORD}
Buffer_Size 1Mb
Logstash_Format On
Logstash_Prefix fluent-bit
Type _doc
Replace_Dots On
Time_Key @ts
Time_Key_Format %Y-%m-%dT%H:%M:%S
tls On
tls.verify Off
Suppress_Type_Name On
Trace_Error On
[OUTPUT]
name azure_blob
match kube-ingress-controller
account_name ${LOG_STORAGE_ACCOUNT}
shared_key ${LOG_STORAGE_ACCOUNT_KEY}
container_name access-logs
blob_type blockblob
auto_create_container on
tls on
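A stripped-down version of the configuration above, keeping only the ingress tail input and the azure_blob output, may help isolate the crash. This is a sketch, not tested in isolation; it assumes the crash is triggered by the azure_blob TLS reconnect path alone, and the account name/key are placeholders:

```ini
# Minimal repro sketch (untested in isolation); replace the placeholders
# with real storage account credentials.
[SERVICE]
    Flush        5
    Log_Level    info

[INPUT]
    Name         tail
    Tag          kube-ingress-controller
    Path         /var/log/containers/*ingress-nginx-controller*.log
    Parser       cri_o

[OUTPUT]
    name                  azure_blob
    match                 kube-ingress-controller
    account_name          <storage-account>
    shared_key            <storage-account-key>
    container_name        access-logs
    blob_type             blockblob
    auto_create_container on
    tls                   on
```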
- Environment name and version (e.g. Kubernetes? What version?): Kubernetes 1.30.5
- Server type and version: Azure Standard_E4as_v5
- Operating System and version: Ubuntu 22.04.5 LTS with kernel 5.15.0-1074-azure
- Filters and plugins: (see config)
Additional context
We are running our application on an AKS cluster with Kubernetes version 1.30.5.
The cluster currently runs 14 nodes, each with 32 GB of memory and 4 CPUs.
The idea behind using the azure_blob plugin is to write a copy of the ingress access logs to a blob storage container.
On clusters where we don't use the plugin, fluent-bit runs just fine.
Likewise, version 3.1.10 runs fine on the same system with the exact same configuration.
Fluent-bit is deployed via Helm through an umbrella chart, with release 0.48.3 from https://fluent.github.io/helm-charts as a dependency.
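Since 3.1.10 runs fine, pinning the image back to that version is a possible workaround. A sketch of the values override, assuming the standard fluent/fluent-bit chart layout and that the umbrella chart nests its values under a fluent-bit key (both are assumptions, adjust for your chart):

```yaml
# Hypothetical values.yaml override for the umbrella chart; the
# "fluent-bit:" nesting is an assumption and may differ per chart.
fluent-bit:
  image:
    tag: 3.1.10
```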