Release 3.2 of fluent-bit crashes on azure-blob support #9677

@GottfriedGanssauge

Description

Bug Report

Describe the bug
Since version 3.2.0, fluent-bit crashes with SIGSEGV on output to Azure blob storage and thus goes into a crash loop.
With version 3.1.10, fluent-bit behaves normally.

To Reproduce

Fluent Bit v3.2.2
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____ 
|  ___| |                | |   | ___ (_) |         |____ |/ __  \
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`' / /'
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \  / /  
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /./ /___
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)_____/


[2024/12/02 17:25:00] [ info] [fluent bit] version=3.2.2, commit=a59c867924, pid=1
[2024/12/02 17:25:00] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/12/02 17:25:00] [ info] [simd    ] disabled
[2024/12/02 17:25:00] [ info] [cmetrics] version=0.9.9
[2024/12/02 17:25:00] [ info] [ctraces ] version=0.5.7
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] initializing
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=0
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] initializing
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] storage_strategy='memory' (memory only)
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] db: delete unmonitored stale inodes from the database: count=0
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] https=1 host=kubernetes.default.svc.cluster.local port=443
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0]  token updated
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2024/12/02 17:25:00] [ info] [filter:kubernetes:kubernetes.0] connectivity OK
[2024/12/02 17:25:00] [ info] [output:es:es.0] worker #0 started
[2024/12/02 17:25:00] [ info] [output:azure_blob:azure_blob.1] account_name=logstideskp2, container_name=access-logs, blob_type=blockblob, emulator_mode=no, endpoint=logstideskp2.blob.core.windows.net, auth_type=key
[2024/12/02 17:25:00] [ info] [output:es:es.0] worker #1 started
[2024/12/02 17:25:00] [ info] [output:azure_blob:azure_blob.1] initializing worker
[2024/12/02 17:25:00] [ info] [output:azure_blob:azure_blob.1] worker #0 started
[2024/12/02 17:25:00] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2024/12/02 17:25:00] [ info] [sp] stream processor started
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=585645 watch_fd=1 name=/var/log/containers/aurora-letsencrypt-cert-manager-cainjector-77547cd645-d5rs5_cert-manager_cert-manager-cainjector-86b5ec5a32f3b6bb3279539443bc6d1d1a162f397b75aed1b2a691e85e60c841.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=579291 watch_fd=2 name=/var/log/containers/crs-559445b4f5-m9988_default_init-crs-ba7349072cf9474905a8d3eecf12637f07a877233e98287bb367933e275e28e0.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=586171 watch_fd=3 name=/var/log/containers/elastic-operator-0_elastic-system_manager-c5c443982a1d231b8b933293ad431f5b2fc13868ed17f44563762f303be2fcf3.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=585776 watch_fd=4 name=/var/log/containers/outputmanager-primary-558d5d778d-52z4c_default_outputmanager-b5085ae3687ef05226fa1e23d301c2212f5a769786f51f37ccb1c3d35327d7ed.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=586402 watch_fd=5 name=/var/log/containers/productconfig-db-prometheus-exporter-5bd89dd994-lshxj_default_mongodb-exporter-bf432bfdaa56e9ae91a92a1ae19b41b7f1771f83536286516e0c92ba87a6c23e.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=585842 watch_fd=6 name=/var/log/containers/static-content-aurora-static-content-deploytest_default_integration-test-40104681518b35c26f87bdceb089236cba991f403694860177ab81d14c9612f4.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] inotify_fs_add(): inode=586553 watch_fd=1 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_istio-init-7f5970a12e83f8736de2926e51ecc71fe1b51c815a31504eafcf7753c912ec6b.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] inotify_fs_add(): inode=586582 watch_fd=2 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_istio-proxy-4ac987779d51964574c8ac298b86d153b7ae03d985944474511481f8b47070e2.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=566543 watch_fd=7 name=/var/log/containers/crs-559445b4f5-m9988_default_crs-server-e584a852c412d682d688c42b670a40caa814541fc24f1a446fdd2fd574361c35.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.0] inotify_fs_add(): inode=566547 watch_fd=8 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_controller-3c1cc39100ea9594e28bc30b903375cfa60cac2af3e153a226c1ed362dcbe39f.log
[2024/12/02 17:25:00] [ info] [input:tail:tail.1] inotify_fs_add(): inode=566547 watch_fd=3 name=/var/log/containers/aurora-controller-ingress-nginx-controller-677859486-5mdqn_ingress_controller-3c1cc39100ea9594e28bc30b903375cfa60cac2af3e153a226c1ed362dcbe39f.log
[2024/12/02 17:25:04] [ info] [output:azure_blob:azure_blob.1] container 'access-logs' already exists
[2024/12/02 17:25:05] [ info] [output:azure_blob:azure_blob.1] content uploaded successfully: file=kube-ingress-controller.1733160304970
[2024/12/02 17:25:05] [ info] [output:azure_blob:azure_blob.1] blob id ZmxiLTE3MzMxNjAzMDQuOTcwMS5pZA== committed successfully
[2024/12/02 17:25:09] [error] [tls] error: error:0A0C0103:SSL routines::internal error
[2024/12/02 17:25:09] [error] [output:azure_blob:azure_blob.1] error requesting container properties
[2024/12/02 17:25:09] [ warn] [engine] failed to flush chunk '1-1733160305.509332912.flb', retry in 8 seconds: task_id=2, input=tail.1 > output=azure_blob.1 (out_id=1)
[2024/12/02 17:25:14] [engine] caught signal (SIGSEGV)
#0  0x7f109a80d45b      in  ???() at ???:0
#1  0x7f109a8111b7      in  ???() at ???:0
#2  0x7f109a811bb3      in  ???() at ???:0
#3  0x7f109a81d06b      in  ???() at ???:0
#4  0x7f109a81e431      in  ???() at ???:0
#5  0x7f109a81eb47      in  ???() at ???:0
#6  0x7f109a820335      in  ???() at ???:0
#7  0x7f109a82140b      in  ???() at ???:0
#8  0x7f109a9f08ff      in  ???() at ???:0
#9  0x7f109aa2e48d      in  ???() at ???:0
#10 0x7f109aa2a6c4      in  ???() at ???:0
#11 0x563bc9ddf50c      in  tls_net_handshake() at src/tls/openssl.c:981
#12 0x563bc9de0214      in  flb_tls_session_create() at src/tls/flb_tls.c:556
#13 0x563bc9deee82      in  flb_io_net_connect() at src/flb_io.c:175
#14 0x563bc9dc6c40      in  create_conn() at src/flb_upstream.c:610
#15 0x563bc9dc6c40      in  flb_upstream_conn_get() at src/flb_upstream.c:799
#16 0x563bc9e7e1a7      in  ensure_container() at plugins/out_azure_blob/azure_blob.c:546
#17 0x563bc9e7e1a7      in  cb_azure_blob_flush() at plugins/out_azure_blob/azure_blob.c:1046
#18 0x563bca2f8c26      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#19 0xffffffffffffffff  in  ???() at ???:0

Expected behavior

Fluent-bit should write the access log entries to the blob container.

Your Environment

  • Version used: 3.2.0 and 3.2.2
  • Configuration:
[SERVICE]
    Flush                 5
    Log_Level             info
    Daemon                off
    Parsers_File          parsers.conf
    Parsers_File          custom_parsers.conf
    # required for readiness and liveness probes
    HTTP_Server           On
    HTTP_Listen           0.0.0.0
    HTTP_PORT             2020

[INPUT]
    Name                  tail
    Tag                   kube.*
    Path                  /var/log/containers/*.log
    Exclude_Path          *kube-*,*azure-*,*blackbox-exporter-*,*blobfuse-flexvol-*,*coredns-*,*istio-*,*node-exporter-*,*keyvault-flexvolume-*,*kiali-*,*fluent-bit*
    Parser                cri_o
    DB                    /var/log/flb_kube.db
    Mem_Buf_Limit         10MB
    Skip_Long_Lines       On
    Refresh_Interval      10
    Inotify_Watcher       ${INOTIFY_WATCHER_ENABLED}

[INPUT]
    Name                  tail
    Tag                   kube-ingress-controller
    Path                  /var/log/containers/*ingress-nginx-controller*.log
    Parser                cri_o
    DB                    /var/log/flb_kube_ingress_controller.db
    Mem_Buf_Limit         10MB
    Skip_Long_Lines       On
    Refresh_Interval      10
    Inotify_Watcher       ${INOTIFY_WATCHER_ENABLED}

# note the order of the filters.
# the lua filters must come after the kubernetes filter, otherwise the log field isn't yet decoded
# when reaching the lua code.
[FILTER]
    Name                  kubernetes
    Match                 kube.*
    Kube_URL              https://kubernetes.default.svc.cluster.local:443
    Merge_Log             On
    Merge_Log_Trim        On
    Keep_Log              Off
    K8S-Logging.Parser    On
    K8S-Logging.Exclude   On
    Buffer_Size           1Mb

[FILTER]
    Name                  lua
    Match                 *
    script                ../scripts/extract_tag.lua
    call                  extract_tag

[FILTER]
    Name                  lua
    Match                 kube*crs-server*
    script                ../scripts/crs.lua
    call                  augment_crs

[FILTER]
    Name                  record_modifier
    Match                 *
    Record                resource_group prod2402130227

[OUTPUT]
    Name                  es
    Match                 kube*
    Host                  ${ELASTICSEARCH_HOST_PREFIX}-es-http
    Port                  9200
    HTTP_User             ${ELASTIC_USERNAME}
    HTTP_Passwd           ${ELASTIC_PASSWORD}
    Buffer_Size           1Mb
    Logstash_Format       On
    Logstash_Prefix       fluent-bit
    Type                  _doc
    Replace_Dots          On
    Time_Key              @ts
    Time_Key_Format       %Y-%m-%dT%H:%M:%S
    tls                   On
    tls.verify            Off
    Suppress_Type_Name    On
    Trace_Error           On

[OUTPUT]
    name                  azure_blob
    match                 kube-ingress-controller
    account_name          ${LOG_STORAGE_ACCOUNT}
    shared_key            ${LOG_STORAGE_ACCOUNT_KEY}
    container_name        access-logs
    blob_type             blockblob
    auto_create_container on
    tls                   on
  • Environment name and version (e.g. Kubernetes? What version?):
    Kubernetes-1.30.5
  • Server type and version:
    Azure Standard_E4as_v5
  • Operating System and version:
    Ubuntu-22.04.5 LTS with kernel 5.15.0-1074-azure
  • Filters and plugins:
    (see config)
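For context on the lua filters referenced in the configuration: the scripts themselves (`../scripts/extract_tag.lua` and `../scripts/crs.lua`) are not included in this report. Purely as an illustration of their shape, a minimal sketch of what `extract_tag.lua` might look like follows; only the `extract_tag` entry point comes from the config, the body is an assumption, and the callback signature is Fluent Bit's standard lua filter interface:

```lua
-- Hypothetical sketch only: the real ../scripts/extract_tag.lua is not
-- shown in this report. Fluent Bit lua filters receive
-- (tag, timestamp, record) and must return (code, timestamp, record),
-- where code 1 means the record was modified and should be re-emitted.
function extract_tag(tag, timestamp, record)
    record["tag"] = tag  -- illustrative only: copy the tag into the record
    return 1, timestamp, record
end
```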

Additional context

We are running our application on an AKS cluster with Kubernetes version 1.30.5.
There are currently 14 nodes, each with 32 GB of memory and 4 CPUs.
The purpose of the azure_blob plugin here is to write a copy of the ingress access logs to blob storage.
On clusters where we don't use the plugin, fluent-bit runs just fine.
Likewise, version 3.1.10 runs fine on the same system with the exact same configuration.
Fluent-bit is deployed using helm through an umbrella chart, with release 0.48.3 from https://fluent.github.io/helm-charts as a dependency.
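Since 3.1.10 is reported to work with the exact same configuration, one stopgap until a fixed release ships would be to pin the chart back to the last known-good image. A sketch of a values override for the fluent-bit chart dependency (the field names follow the upstream chart's conventional layout and are an assumption here, as is the dependency alias):

```yaml
# Hypothetical values.yaml override for the fluent-bit chart dependency
# in the umbrella chart: pin the image to the last version that did not
# crash (per this report). Field names are assumptions.
fluent-bit:
  image:
    tag: "3.1.10"
```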

Metadata

Labels

waiting-for-release: This has been fixed/merged but it's waiting to be included in a release.
