[beats receivers] Surface which output configuration caused the collector to exit #9771

@cmacknz

Today, if a Beat has an elasticsearch output with an invalid SSL configuration, the Beat sub-process will fail to start. The elastic-agent status output and the Fleet health report will indicate that the component (which always has exactly one output) has failed, allowing users to tell which output caused the problem.

For example, an elastic-agent.yml with the following configuration where "/etc/client/cert.pem" does not exist will fail to start:

outputs:
  broken:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    api_key: "example-key"
    ssl:
      certificate: "/etc/client/cert.pem"
      key: "/etc/client/cert.key"

inputs:
  - type: system/metrics
    id: unique-system-metrics-input
    use_output: broken
    streams:
      - metricsets:
        - cpu

agent.monitoring:
  enabled: false

The elastic-agent status output will show the output unit of the system/metrics-broken component (named after the input type and the broken output) as failed with a clear error:

❯ sudo elastic-development-agent status
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a failed state
   └─ system/metrics-broken
      ├─ status: (HEALTHY) Healthy: communicating with pid '14069'
      ├─ system/metrics-broken
      │  └─ status: (FAILED) could not start output: failed to reload output: open /etc/client/cert.pem: no such file or directory /etc/client/cert.pem accessing 'elasticsearch'
      └─ system/metrics-broken-unique-system-metrics-input
         └─ status: (STARTING) Starting

With the switch in elastic/opentelemetry-collector-components#722 to a collector auth extension that lets the Elasticsearch exporter use the Beats HTTP transport, the collector will instead exit with the error associated with the failing extension.

The following equivalent collector configuration will cause the collector to exit:

receivers:
  filelog:
    include_file_name: true
    include:
      - "./otlp-all.json"

extensions:
  beatsauth:
    ssl:
      enabled: true
      verification_mode: none
      certificate: "/etc/client/cert.pem"
      key: "/etc/client/cert.key"
    timeout: 9s

exporters:
  elasticsearch:
    endpoints:
      - https://localhost:9200
    password: testing
    user: admin
    auth:
      authenticator: beatsauth

service:
  extensions: [beatsauth]
  pipelines:
    logs:
      receivers: [filelog]
      processors: []
      exporters: [elasticsearch]

The error in this case is vaguer and is attached to the entire collector process rather than to a specific output:

2025-09-05T15:19:18.682-0400    error   service@v0.130.0/service.go:187 error found during service initialization   {"resource": {"service.instance.id": "12bc8224-d5a1-48e9-9422-2744a923b584", "service.name": "elastic-collector-components", "service.version": "0.0.1"}, "error": "failed to build extensions: failed to create extension \"beatsauth\": failed unpacking config: open /etc/client/cert.pem: no such file or directory /etc/client/cert.pem accessing config"}
go.opentelemetry.io/collector/service.New.func1
        go.opentelemetry.io/collector/service@v0.130.0/service.go:187
go.opentelemetry.io/collector/service.New
        go.opentelemetry.io/collector/service@v0.130.0/service.go:223
go.opentelemetry.io/collector/otelcol.(*Collector).setupConfigurationComponents
        go.opentelemetry.io/collector/otelcol@v0.130.0/collector.go:197
go.opentelemetry.io/collector/otelcol.(*Collector).Run
        go.opentelemetry.io/collector/otelcol@v0.130.0/collector.go:312
go.opentelemetry.io/collector/otelcol.NewCommand.func1
        go.opentelemetry.io/collector/otelcol@v0.130.0/command.go:39
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra@v1.9.1/command.go:1015
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra@v1.9.1/command.go:1148
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra@v1.9.1/command.go:1071
main.runInteractive
        github.com/elastic/opentelemetry-collector-components/main.go:58
main.run
        github.com/elastic/opentelemetry-collector-components/main_others.go:10
main.main
        github.com/elastic/opentelemetry-collector-components/main.go:51
runtime.main
        runtime/proc.go:285
Error: failed to build extensions: failed to create extension "beatsauth": failed unpacking config: open /etc/client/cert.pem: no such file or directory /etc/client/cert.pem accessing config
2025/09/05 15:19:18 collector server run finished with error: failed to build extensions: failed to create extension "beatsauth": failed unpacking config: open /etc/client/cert.pem: no such file or directory /etc/client/cert.pem accessing config
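
One way to recover the failing component from output like the above is to match the `failed to create <kind> "<id>"` fragment of the final error line. The Go sketch below assumes that this wording, taken from the log in this issue, stays stable, which is not guaranteed. `parseStartupError` and `failedComponent` are hypothetical names for illustration and are not part of the collector.

```go
package main

import (
	"fmt"
	"regexp"
)

// failedComponent identifies the collector component named in a startup error.
// Hypothetical type, for illustration only.
type failedComponent struct {
	Kind  string // for example "extension"
	ID    string // for example "beatsauth"
	Cause string // remainder of the error message
}

// createErrRe matches the `failed to create <kind> "<id>": <cause>` fragment.
// Assumption: the message uses unescaped double quotes, as in the plain
// "Error: ..." line, not the JSON-escaped form in the structured log line.
var createErrRe = regexp.MustCompile(`failed to create (\w+) "([^"]+)": (.*)`)

// parseStartupError extracts the failing component from an error message.
// It returns false when the message does not contain the expected fragment.
func parseStartupError(msg string) (failedComponent, bool) {
	m := createErrRe.FindStringSubmatch(msg)
	if m == nil {
		return failedComponent{}, false
	}
	return failedComponent{Kind: m[1], ID: m[2], Cause: m[3]}, true
}

func main() {
	msg := `failed to build extensions: failed to create extension "beatsauth": ` +
		`failed unpacking config: open /etc/client/cert.pem: no such file or directory ` +
		`/etc/client/cert.pem accessing config`
	if c, ok := parseStartupError(msg); ok {
		fmt.Printf("kind=%s id=%s\ncause=%s\n", c.Kind, c.ID, c.Cause)
	}
}
```

Matching on error text is brittle, so this is only a stopgap compared with having the collector report the failing component in a structured way.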

When we execute the collector as a sub-process, we will need some way to surface this error in the health report sent to Fleet and to associate it back to the originating output.
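
Associating a failing auth extension with an output could start from the collector configuration: an exporter is affected when its `auth.authenticator` names that extension. The following is a minimal Go sketch, assuming the configuration has already been decoded into nested maps. `exportersUsingAuthenticator` is a hypothetical helper. Mapping the exporter ID back to the Elastic Agent output name is left out, because it depends on how the agent generates the collector configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// exportersUsingAuthenticator returns the sorted IDs of exporters whose
// auth.authenticator setting names the given extension.
// Hypothetical helper: cfg is assumed to be the decoded collector config.
func exportersUsingAuthenticator(cfg map[string]any, extensionID string) []string {
	exporters, _ := cfg["exporters"].(map[string]any)
	var ids []string
	for id, raw := range exporters {
		exporter, _ := raw.(map[string]any)
		auth, _ := exporter["auth"].(map[string]any)
		// Reading from a nil map is safe in Go and yields the zero value.
		if name, ok := auth["authenticator"].(string); ok && name == extensionID {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids) // map iteration order is random, so sort for stable output
	return ids
}

func main() {
	// Mirrors the relevant parts of the collector configuration in this issue.
	cfg := map[string]any{
		"extensions": map[string]any{
			"beatsauth": map[string]any{"timeout": "9s"},
		},
		"exporters": map[string]any{
			"elasticsearch": map[string]any{
				"endpoints": []any{"https://localhost:9200"},
				"auth":      map[string]any{"authenticator": "beatsauth"},
			},
		},
	}
	fmt.Println(exportersUsingAuthenticator(cfg, "beatsauth")) // prints [elasticsearch]
}
```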

We discussed what to do about this in the beats receiver meeting today and concluded:
