Compression_error and crashing #9050

Open
jparrattwork opened this issue Jun 29, 2023 · 3 comments
Labels
bug (Something isn't working), question (Further information is requested)

Comments


jparrattwork commented Jun 29, 2023

Component(s)

No response

What happened?

Description

We have an application sending OTLP data to the collector via gRPC. We then send that from the collector to our Splunk backend. The collector seems to stop sending any logs/traces/metrics and both the collector and our application eventually crash.

Steps to Reproduce

Send OTLP via gRPC to the collector

Expected Result

The collector produces no warnings and doesn't crash

Actual Result

The collector prints the following warning over and over:
warn zapgrpc/zapgrpc.go:195 [transport] transport: http2Server.HandleStreams failed to read frame: connection error: COMPRESSION_ERROR {"grpc_log": true}

and then eventually crashes

Collector version

v0.68.0

Environment information

Environment

OS: Windows Server 2019
Compiler (if manually compiled): go version go1.19.5 windows/amd64

OpenTelemetry Collector configuration

extensions:
  # Enables health check endpoint for otel collector - https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/healthcheckextension
  health_check:
  # Opens up zpages for dev/debugging - https://github.com/open-telemetry/opentelemetry-collector/tree/main/extension/zpagesextension
  zpages:
    endpoint: localhost:55679

receivers:
  # For dotnet apps
  otlp:
    protocols:
      grpc:
      http:

  # FluentD
  fluentforward:
    endpoint: 0.0.0.0:8006

  # Otel Internal Metrics
  prometheus:
    config:
      scrape_configs:
      - job_name: 'otelcol' # Gets mapped to service.name
        scrape_interval: 10s
        static_configs:
        - targets: ['0.0.0.0:8888']
  
  # System Metrics
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
      disk:
      filesystem:
      memory:
      network:
      # System load average metrics https://en.wikipedia.org/wiki/Load_(computing)
      load:
      # Paging/Swap space utilization and I/O metrics
      paging:
      # Aggregated system process count metrics
      processes:
      # System processes metrics, disabled by default
      # process:  

processors:
  batch: # Batches data when sending
  resourcedetection:
    detectors: [gce, ecs, ec2, azure, system]
    timeout: 2s
    override: false
  transform/body-empty:
    log_statements:
      - context: log
        statements:
          - set(body, "body-empty") where body == nil
  groupbyattrs:
    keys:
    - service.name
    - service.version
    - host.name
  # Enabling the memory_limiter is strongly recommended for every pipeline.
  # Configuration is based on the amount of memory allocated to the collector.
  # For more information about memory limiter, see
  # https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/memorylimiter/README.md
  memory_limiter:
    check_interval: 2s
    limit_mib: 256              
 
exporters:
  splunk_hec/logs:
    token: hidden
    endpoint: hidden
    index: hidden
    max_connections: 20
    disable_compression: false
    timeout: 10s
    tls:
      insecure_skip_verify: true
      ca_file: ""
      cert_file: ""
      key_file: ""

  splunk_hec/traces:
    token: hidden
    endpoint: hidden
    index: hidden
    max_connections: 20
    disable_compression: false
    timeout: 10s
    tls:
      insecure_skip_verify: true
      ca_file: ""
      cert_file: ""
      key_file: ""
 
  splunk_hec/metrics:
    token: hidden
    endpoint: hidden
    index: hidden
    max_connections: 20
    disable_compression: false
    timeout: 10s
    tls:
      insecure_skip_verify: true
      ca_file: ""
      cert_file: ""
      key_file: ""      

service:
  # zpages port : 55679

  pipelines:
    logs:
      receivers: [otlp, fluentforward]
      processors: [resourcedetection, transform/body-empty, groupbyattrs, memory_limiter, batch]
      exporters: [splunk_hec/logs]
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [resourcedetection, groupbyattrs, memory_limiter, batch]
      exporters: [splunk_hec/metrics]
    traces:
      receivers: [otlp]
      processors: [resourcedetection, groupbyattrs, memory_limiter, batch]
      exporters: [splunk_hec/traces]
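
(A side note on the pipelines above: the memory_limiter README linked in the config comment recommends running memory_limiter as the first processor in each pipeline so it can push back on receivers before other processors do work; here it runs just before batch. This may be unrelated to the error, but a reordered logs pipeline would look roughly like this:)

service:
  pipelines:
    logs:
      receivers: [otlp, fluentforward]
      processors: [memory_limiter, resourcedetection, transform/body-empty, groupbyattrs, batch]
      exporters: [splunk_hec/logs]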

Log output

1.6879002944284434e+09	warn	zapgrpc/zapgrpc.go:195	[transport] transport: http2Server.HandleStreams failed to read frame: connection error: COMPRESSION_ERROR	{"grpc_log": true}

Additional context

No response

jparrattwork added the bug label Jun 29, 2023

atoulme commented Jun 30, 2023

Please try the latest release. Which distribution are you using? Which OS are you running on?

Please see here to elevate log level and troubleshoot further: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/troubleshooting.md
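
For example, the collector's own log level can be raised via the service telemetry settings (a minimal sketch; adjust as needed for your distribution):

service:
  telemetry:
    logs:
      level: debug

That should print more context around the COMPRESSION_ERROR warnings and around the crash.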

bryan-aguilar added the question label Aug 11, 2023
github-actions bot commented Nov 13, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions bot added the Stale label Nov 13, 2023

atoulme commented Dec 6, 2023

This might relate to #9022. I will transfer the issue over.

atoulme transferred this issue from open-telemetry/opentelemetry-collector-contrib Dec 6, 2023
github-actions bot removed the Stale label Dec 8, 2023