
[error] [opentelemetry] snappy decompression failed issue #9737

Open
3siksfather opened this issue Dec 17, 2024 · 2 comments

Comments

@3siksfather

Bug Report

Describe the bug
Actual Behavior:
Fluent Bit throws a snappy decompression failed error.
Metrics are missing in Grafana, likely because the incoming data could not be decompressed.

To Reproduce
The fluent-bit pod logs the following error at random times:
[error] [opentelemetry] snappy decompression failed
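Prometheus remote_write sends snappy-compressed protobuf bodies, so the error suggests the input received a body it could not decompress. A minimal sketch to exercise the decompression path in isolation (assuming the python-snappy and requests packages are installed; the body below is an arbitrary placeholder, not a real prompb.WriteRequest):

# Sketch: POST a snappy-compressed body to the prometheus_remote_write
# input to check whether decompression itself is the failing step.
# The body is a placeholder; the URL is the remote_write url from the
# prometheus configmap further down in this issue.
import snappy
import requests

body = snappy.compress(b"placeholder payload")  # valid snappy block, arbitrary content

resp = requests.post(
    "http://fluent-bit-metric.default.svc.cluster.local:8080/",
    data=body,
    headers={
        "Content-Encoding": "snappy",
        "Content-Type": "application/x-protobuf",
        "X-Prometheus-Remote-Write-Version": "0.1.0",
    },
    timeout=5,
)
print(resp.status_code, resp.text)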
Screenshots
(screenshot: Grafana query results showing the missing metrics)

Your Environment
prometheus-server latest
fluent-bit latest
opentelemetry-collector latest

configmap
[SERVICE]
Flush 5
Daemon Off
Log_Level debug
Config_Watch On
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
Health_Check On

[INPUT]
name prometheus_remote_write
listen 0.0.0.0
port 8080

[OUTPUT]
name stdout
match *

[OUTPUT]
Name opentelemetry
Match *
Host opentelemetry-collector.default.svc.cluster.local
Port 4318

Additional context
Request:
Could you please help investigate the root cause of the snappy decompression failed error? Additionally, if there are any configuration changes or updates to the OpenTelemetry exporter or Fluent Bit that might resolve this issue, that would be helpful.

@edsiper (Member) commented Dec 17, 2024

Please share your full config. If there is a way to capture the payload that is generating the issue, that would be very helpful.
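One possible way to capture the raw remote_write payload is to temporarily point the remote_write url at a small HTTP sink that writes each request body to a file (a minimal sketch; the port and file naming are placeholders):

# Sketch: tiny HTTP sink that saves each POSTed remote_write body to disk,
# so a payload that triggers the error can be captured and attached here.
from http.server import BaseHTTPRequestHandler, HTTPServer
import time

class DumpHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        fname = f"payload-{int(time.time() * 1000)}.snappy"
        with open(fname, "wb") as f:
            f.write(body)  # still snappy-compressed, exactly as sent
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), DumpHandler).serve_forever()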

@3siksfather (Author) commented Dec 18, 2024

@edsiper Is there a specific file or config you want?
All configurations are the defaults from the latest version of the Helm chart; the only modifications are the settings required for the integration.


configmap (prometheus)

remote_write:
  - url: "http://fluent-bit-metric.default.svc.cluster.local:8080"
    write_relabel_configs:
      - action: drop
        regex: (~~~)
        source_labels: [__name__]
      - action: drop
        regex: kubernetes-apiservers
        source_labels: [job]
    queue_config:
      capacity: 4000              # default 2500
      max_shards: 50              # default = 200
      min_shards: 10              # default = 1
      max_samples_per_send: 2000  # default = 500
      batch_send_deadline: 15s    # default = 5s
      min_backoff: 30ms           # default = 30ms
      max_backoff: 100ms          # default = 100ms
    metadata_config:
      send_interval: 10s          # default = 1m


fluent-bit: the same configmap as shown above.

opentelemetry
processors:
  batch: {}
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:14250
      thrift_compact:
        endpoint: ${env:MY_POD_IP}:6831
      thrift_http:
        endpoint: ${env:MY_POD_IP}:14268
  otlp:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:4317
      http:
        endpoint: ${env:MY_POD_IP}:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-collector
          scrape_interval: 10s
          static_configs:
            - targets:
                - ${env:MY_POD_IP}:8888
  zipkin:
    endpoint: ${env:MY_POD_IP}:9411
service:
  extensions:
    - health_check
  pipelines:
    logs:
      exporters:
        - debug
      processors:
        - memory_limiter
        - batch
      receivers:
        - otlp
    metrics:
      exporters:
        - debug
      processors:
        - memory_limiter
        - batch
      receivers:
        - otlp
        - prometheus
    traces:
      exporters:
        - debug
      processors:
        - memory_limiter
        - batch
      receivers:
        - otlp
        - jaeger
        - zipkin
  telemetry:
    metrics:
      address: ${env:MY_POD_IP}:8888
