Flow: add OpenTelemetry Collector components #2213

Closed
21 tasks done
rfratto opened this issue Sep 24, 2022 · 10 comments · Fixed by #5008
Labels
flow Related to Grafana Agent Flow frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. proposal Proposal or RFC proposal-accepted Proposal has been accepted.
Milestone

Comments

@rfratto
Member

rfratto commented Sep 24, 2022

This issue proposes a set of otelcol components to add first-class OpenTelemetry Collector support in Grafana Agent Flow.

Adding first-class OpenTelemetry Collector support enables Grafana Agent to embrace Grafana's big tent philosophy, allowing the agent to support multiple vendors outside of the LGTM stack.

Proposal

We will create a set of OpenTelemetry Collector components across three namespaces:

  • otelcol.receiver: A list of OpenTelemetry Collector receivers
  • otelcol.processor: A list of OpenTelemetry Collector processors
  • otelcol.exporter: A list of OpenTelemetry Collector exporters

A description of how to handle extensions is out of scope for this proposal. However, extensions for at least authentication will be included.

NOTE: Throughout this proposal, I use the term "consumer" in the same sense as OpenTelemetry Collector; a consumer is a component which can consume OpenTelemetry data sent from other components.

Pipeline construction

Grafana Agent Flow creates a pipeline by having each component reference other components where data will be sent. This contrasts with OpenTelemetry Collector, where components are defined separately from the pipeline.

For consistency with other Flow components, the set of otelcol components will form a pipeline the same way: by having each component reference other components where data will be sent.

Arguments

All otelcol Flow components which send data to a consumer will support the following arguments:

  • output.traces: A list of consumers where traces should be forwarded. Only available if the configured component supports traces.
  • output.metrics: A list of consumers where metrics should be forwarded. Only available if the configured component supports metrics.
  • output.logs: A list of consumers where logs should be forwarded. Only available if the configured component supports logs.

Exported fields

otelcol Flow components which act as consumers will support the following exported fields:

  • input: A consumer for telemetry data. Any telemetry type supported by the OTel component can be sent to this consumer.

Note that there is only one exported field for input for all telemetry types. This simplifies the config, and there doesn't seem to be much value in separating the fields.

Example config

otelcol.receiver.jaeger "default" {
  // Define the server settings where applications 
  // may push Jaeger-formatted traces to. 
  server {
    // Create a gRPC server with the default settings. 
    grpc {} 
  }

  output {  
    traces = [otelcol.processor.batch.default.input] 
  } 
}

otelcol.processor.batch "default" {
  output {
    traces = [otelcol.exporter.logging.default.input]
  }
}

otelcol.exporter.logging "default" {
  log_level = "debug"

  // Exporters cannot forward data to a consumer, 
  // so the "output" field does not exist here. 
}
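
As a further sketch of the arguments and exported fields described above: each telemetry type is routed through its own output list, a list may name more than one consumer, and a consumer's single input field accepts every telemetry type it supports. The grpc block on otelcol.receiver.otlp, the client and endpoint settings on otelcol.exporter.otlp, and the endpoint value itself are assumptions made for illustration; this proposal does not define them.

```river
otelcol.receiver.otlp "default" {
  // Assumed block name; receiver settings are not specified in this proposal.
  grpc {}

  output {
    // Each telemetry type is forwarded independently.
    metrics = [otelcol.processor.batch.default.input]
    logs    = [otelcol.processor.batch.default.input]

    // A list may hold several consumers, which fans the data out.
    traces = [otelcol.processor.batch.default.input, otelcol.exporter.logging.default.input]
  }
}

otelcol.processor.batch "default" {
  output {
    // The same exported "input" field receives all three telemetry types.
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  // Assumed settings with a placeholder endpoint.
  client {
    endpoint = "otlp.example.com:4317"
  }
}

otelcol.exporter.logging "default" {
  log_level = "debug"
}
```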

Flow component list

We will cherry-pick specific OpenTelemetry Collector components from both the otelcol and otelcol-contrib distributions to include in Grafana Agent Flow.

Our initial set of components is minimal, but will grow over time. While we should consider supporting most existing OTel components, we should initially avoid supporting any OTel components whose functionality is covered by an existing Flow component. This will be the case for otelcol-contrib receivers which collide with functionality of prometheus.integration Flow components.

We will start with the following components:

otelcol distribution components:

  • otelcol.receiver.otlp: Receives OTLP data over the network.
  • otelcol.processor.batch: Batches telemetry data before sending to another consumer.
  • otelcol.processor.memory_limiter: Drops telemetry data if memory usage is getting too high.
  • otelcol.exporter.otlp: Sends telemetry data via gRPC using OTLP.
  • otelcol.exporter.otlphttp: Sends telemetry data via HTTP using OTLP.

otelcol-contrib distribution components:

  • otelcol.receiver.jaeger: Receives telemetry data from Jaeger applications.
  • otelcol.receiver.kafka: Receives telemetry data from Kafka.
  • otelcol.receiver.opencensus: Receives telemetry data from OpenCensus.
  • otelcol.receiver.zipkin: Receives telemetry data from Zipkin.
  • otelcol.processor.attributes: Processes span attributes.
  • otelcol.processor.spanmetrics: Creates metrics from spans.
  • otelcol.processor.tail_sampling: Performs tail sampling against tracing data.
  • otelcol.exporter.jaeger: Writes telemetry data to a Jaeger server.

This list of components is based solely on the current exposed functionality of the tracing subsystem within Grafana Agent. As the intent is to embrace the big tent philosophy, we should continue to add more exporters for other vendors over time.

Additionally, we will create Grafana Agent-specific custom components:

  • otelcol.exporter.logging: Logs telemetry data in-process.
  • otelcol.processor.discovery: Associates incoming spans with a Prometheus discovery target.
  • otelcol.processor.servicegraph: Generates a service graph from incoming spans.
  • otelcol.receiver.prometheus: Receives Prometheus metrics from a prometheus.scrape Flow component.
  • otelcol.exporter.prometheus: Sends Prometheus metrics to a prometheus.remote_write Flow component.

The last two components, otelcol.receiver.prometheus and otelcol.exporter.prometheus, enable interoperability between the Prometheus-specific and OpenTelemetry-specific components, allowing users to freely switch between both as needed.
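
A rough sketch of how that interoperability might look in a config follows. The export name receiver on otelcol.receiver.prometheus, the forward_to argument on otelcol.exporter.prometheus, the scrape target, and the remote_write URL are all assumptions for illustration; this proposal does not specify them.

```river
// Prometheus to OpenTelemetry: scraped metrics enter an otelcol pipeline.
prometheus.scrape "app" {
  targets = [{"__address__" = "localhost:8080"}]

  // Assumed export name on the otelcol side.
  forward_to = [otelcol.receiver.prometheus.default.receiver]
}

otelcol.receiver.prometheus "default" {
  output {
    metrics = [otelcol.processor.batch.default.input]
  }
}

otelcol.processor.batch "default" {
  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}

// OpenTelemetry to Prometheus: metrics leave the otelcol pipeline again.
otelcol.exporter.prometheus "default" {
  // Assumed argument name, mirroring other prometheus.* components.
  forward_to = [prometheus.remote_write.default.receiver]
}

prometheus.remote_write "default" {
  endpoint {
    // Placeholder URL.
    url = "https://prometheus.example.com/api/v1/write"
  }
}
```

Routing through both converters in one pipeline is only meant to show the two directions side by side; in practice a config would more likely use one of them.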

Implementation

Flow otelcol.* components should work by wrapping around upstream components from OpenTelemetry Collector, using River for configuration. We should not reimplement components that already exist, with the exception of the Grafana Agent-specific custom components.

An initial prototype for this was done in #1843. The prototype can serve as a base for the final implementation.

Tasks

This is the list of components to implement for the initial set of components:

These components will be gradually introduced across a few releases.

@rfratto rfratto added proposal Proposal or RFC flow labels Sep 24, 2022
@rfratto rfratto added this to the v0.29.0 milestone Sep 24, 2022
rfratto added a commit to rfratto/agent that referenced this issue Sep 27, 2022
This commit introduces the component/otelcol package, which will
eventually hold a collection of `otelcol.*` components.

Initial prototyping work for Flow OpenTelemetry Collector components was
done in grafana#1843, which demonstrated that the full set of code needed to
implement OpenTelemetry Components is quite large. I will be splitting
up the needed code across a few changes; this is the first.

This initial commit starts setting up the framework for running
OpenTelemetry Collector components inside of Flow with a
`componentScheduler` struct.

Related to grafana#2213.
rfratto added a commit that referenced this issue Sep 27, 2022
…onents (#2224)

* component/otelcol: initial commit

This commit introduces the component/otelcol package, which will
eventually hold a collection of `otelcol.*` components.

Initial prototyping work for Flow OpenTelemetry Collector components was
done in #1843, which demonstrated that the full set of code needed to
implement OpenTelemetry Components is quite large. I will be splitting
up the needed code across a few changes; this is the first.

This initial commit starts setting up the framework for running
OpenTelemetry Collector components inside of Flow with a
`componentScheduler` struct.

Related to #2213.

* move scheduler to an internal/scheduler package.

The otelcol component scheduler code will need to be exposed to multiple
packages, but it doesn't make sense to make it part of the public API.

* fix linting errors
rfratto added a commit to rfratto/agent that referenced this issue Sep 27, 2022
This commit introduces a new package, component/otelcol/exporter, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector exporters.

There is some stuff left to do for this implementation to be complete:

* A Zap logging adapter needs to be created to correctly process logs
  from OpenTelemetry Collector components.
* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

All of the above will be done in separate PRs.

As of this commit, there are no registered `otelcol.exporter.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to grafana#2213.
rfratto added a commit that referenced this issue Sep 28, 2022
)

* component/otelcol/exporter: initial commit

This commit introduces a new package, component/otelcol/exporter, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector exporters.

There is some stuff left to do for this implementation to be complete:

* A Zap logging adapter needs to be created to correctly process logs
  from OpenTelemetry Collector components.
* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

All of the above will be done in separate PRs.

As of this commit, there are no registered `otelcol.exporter.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to #2213.

* switch to using JSON representation for test traces

* fix typo

* scheduler: clarify comment

* Update component/otelcol/internal/scheduler/scheduler.go

* scheduler: make customizing exports/extensions options on NewHost

* fix flaky test
rfratto added a commit to rfratto/agent that referenced this issue Sep 30, 2022
This commit introduces a new package, component/otelcol/receiver, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector receiver.

Like grafana#2227, it leaves some work unfinished for future PRs:

* A Zap logging adapter needs to be created to correctly process logs
  from OpenTelemetry Collector components.
* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.receiver.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to grafana#2213.
rfratto added a commit that referenced this issue Oct 3, 2022
)

This commit introduces a new package, component/otelcol/receiver, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector receiver.

Like #2227, it leaves some work unfinished for future PRs:

* A Zap logging adapter needs to be created to correctly process logs
  from OpenTelemetry Collector components.
* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.receiver.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to #2213.
@rfratto rfratto moved this to In Progress in Grafana Agent (Public) Oct 3, 2022
@rfratto rfratto self-assigned this Oct 3, 2022
@rfratto rfratto removed the flow label Oct 3, 2022
rfratto added a commit to rfratto/agent that referenced this issue Oct 3, 2022
This commit introduces a new package, component/otelcol/processor, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector processor.

Like grafana#2227 and grafana#2254, it leaves some work unfinished for future PRs:

* A Zap logging adapter needs to be created to correctly process logs
  from OpenTelemetry Collector components.
* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.processor.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to grafana#2213.
rfratto added a commit to rfratto/agent that referenced this issue Oct 3, 2022
…ents

This introduces component/otelcol/internal/zapadapter, which creates a
*zap.Logger instance from a github.com/go-kit/log.Logger instance. This
is then used when creating OpenTelemetry Collector components, allowing
us to continue to use github.com/go-kit/log consistently throughout
Flow.

Related to grafana#2213.
rfratto added a commit to rfratto/agent that referenced this issue Oct 4, 2022
This commit introduces a new package, component/otelcol/processor, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector processor.

Like grafana#2227 and grafana#2254, it leaves some work unfinished for future PRs:

* A Zap logging adapter needs to be created to correctly process logs
  from OpenTelemetry Collector components.
* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.processor.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to grafana#2213.
rfratto added a commit that referenced this issue Oct 4, 2022
…2284)

This commit introduces a new package, component/otelcol/processor, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector processor.

Like #2227 and #2254, it leaves some work unfinished for future PRs:

* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.processor.*`
components. Implementations for OpenTelemetry Collector Flow components
will be done in future PRs.

Related to #2213.
@rfratto
Member Author

rfratto commented Oct 11, 2022

OpenTelemetry Collector extensions will not be exposed as Flow components.

I'm on the fence about this now. Authentication is handled by extensions, and the way we wrap components doesn't make it easy for otelcol Flow component wrappers to handle their own authentication (see #2343).

There seem to be two choices here:

  • Match upstream: Flow components should match upstream behavior and introduce components like otelcol.extension.basic_auth which are then wired up to other components which support authentication.
    • This allows otelcol components to match more closely with upstream.
    • This is easier to implement since it's a direct parallel to the thing we're wrapping.
  • Match existing Flow components: Bake in authentication support
    • This allows otelcol components to match more closely with prometheus components.
    • This is more challenging to implement.
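
To make the two choices concrete, here is a hedged sketch of how each might look in a config. Apart from the otelcol.extension.basic_auth component name mentioned above, every name is hypothetical: the handler export, the auth argument, the client block, the nested basic_auth block, and the endpoint and credential values are assumptions for illustration only.

```river
// Option 1 (match upstream): authentication lives in its own component,
// which other components reference.
otelcol.extension.basic_auth "creds" {
  username = "user"
  password = "secret"
}

otelcol.exporter.otlp "upstream_style" {
  client {
    endpoint = "otlp.example.com:4317"

    // Hypothetical export name for the authenticator.
    auth = otelcol.extension.basic_auth.creds.handler
  }
}

// Option 2 (match existing Flow components): authentication settings are
// arguments of the component that needs them.
otelcol.exporter.otlp "flow_style" {
  client {
    endpoint = "otlp.example.com:4317"

    // Hypothetical nested block, in the style of prometheus components.
    basic_auth {
      username = "user"
      password = "secret"
    }
  }
}
```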

rfratto added a commit to rfratto/agent that referenced this issue Oct 12, 2022
OpenTelemetry supports many extensions. Extensions are used as a generic
way to put "everything else" into OpenTelemetry. There are two types of
extensions relevant to us:

* [Authentication extensions][auth-ext]: used for both client and server
  authentication.
* [Storage extensions][storage-ext]: used for external storage of state.

Other extensions, such as [awsproxy][] are useful but better suited as
generic Flow components rather than being shoved in the otelcol
namespace, since they are unrelated to telemetry pipelines and aren't
referenced by other otelcol components in the upstream configuration.

This commit introduces a new package, component/otelcol/auth, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector extensions meant for authentication.

While storage extensions may end up being Flow components eventually,
it's currently marked as experimental upstream. We will reevaluate
storage extension components once things have stabilized a little more.

Like grafana#2227, grafana#2254, and grafana#2284, it leaves some work unfinished for future
PRs:

* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.auth.*` components.
Implementations for OpenTelemetry Collector Flow components will be done
in future PRs.

Related to grafana#2213.

[auth-ext]: https://pkg.go.dev/go.opentelemetry.io/collector@v0.61.0/config/configauth
[storage-ext]: https://pkg.go.dev/go.opentelemetry.io/collector/extension/experimental/storage
[awsproxy]: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.61.0/extension/awsproxy
rfratto added a commit to rfratto/agent that referenced this issue Oct 12, 2022
OpenTelemetry supports many extensions. Extensions are used as a generic
way to put "everything else" into OpenTelemetry. To quote their
[documentation][ext-docs]:

> Extension is the interface for objects hosted by the OpenTelemetry
> Collector that don't participate directly on data pipelines but provide
> some functionality to the service, examples: health check endpoint,
> z-pages, etc.

There are two types of extensions relevant to us:

* [Authentication extensions][auth-ext]: used for both client and server
  authentication.
* [Storage extensions][storage-ext]: used for external storage of state.

Other extensions, such as [awsproxy][] are useful but better suited as
generic Flow components rather than being shoved in the otelcol
namespace, since they are unrelated to telemetry pipelines and aren't
referenced by other otelcol components in the upstream configuration.

This commit introduces a new package, component/otelcol/auth, which
exposes a generic Flow component implementation which can run
OpenTelemetry Collector extensions meant for authentication.

While storage extensions may end up being Flow components eventually,
it's currently marked as experimental upstream. We will reevaluate
storage extension components once things have stabilized a little more.

Like grafana#2227, grafana#2254, and grafana#2284, it leaves some work unfinished for future
PRs:

* Component-specific metrics are currently ignored.
* Component-specific traces are currently ignored.

As of this commit, there are no registered `otelcol.auth.*` components.
Implementations for OpenTelemetry Collector Flow components will be done
in future PRs.

Related to grafana#2213.

[ext-docs]: https://pkg.go.dev/go.opentelemetry.io/collector@v0.61.0/component#Extension
[auth-ext]: https://pkg.go.dev/go.opentelemetry.io/collector@v0.61.0/config/configauth
[storage-ext]: https://pkg.go.dev/go.opentelemetry.io/collector/extension/experimental/storage
[awsproxy]: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.61.0/extension/awsproxy
@rfratto rfratto modified the milestones: v0.30.0, v0.31.0 Jan 3, 2023
@rfratto rfratto added flow Related to Grafana Agent Flow flow/feature-parity labels Jan 19, 2023
@rfratto rfratto modified the milestones: v0.31.0, v0.32.0 Jan 30, 2023
@rfratto rfratto modified the milestones: v0.32.0, v0.33.0 Feb 28, 2023
@rfratto rfratto modified the milestones: v0.33.0, v0.34.0 Apr 26, 2023
@mattdurham mattdurham added the proposal-accepted Proposal has been accepted. label May 17, 2023
@ptodev
Contributor

ptodev commented Jun 6, 2023

The tracing subsystem in Static mode also supports "load_balancing" and "automatic_logging". Are we going to offer feature parity on those?

Static mode uses the Collector's loadbalancing exporter directly. Should we add it to the list of components which we need to port over?

For automatic logging, I cannot find an equivalent Collector processor, so we might have to port our own.

@rfratto
Member Author

rfratto commented Jun 6, 2023

The tracing subsystem in Static mode also supports "load_balancing" and "automatic_logging". Are we going to offer feature parity on those?

For automatic_logging: otelcol.exporter.logging was supposed to be the equivalent of automatic_logging, even if they don't emit exactly the same format.

For load_balancing: yes, that needs to be added to the list; it was something I missed originally.

@sylvainOL

Hello @rfratto, I tried otelcol.exporter.logging with normal (just the number of traces processed) and detailed (the output contained literal \n sequences, so I figured it wasn't right).

So with detailed, it should work with Loki (another person on our team is moving from Elasticsearch to Loki) and Grafana?

Thanks for the tool!

@cyrille-leclerc

cyrille-leclerc commented Jun 8, 2023

@zeitlinger and I ran into the same \n problem as @sylvainOL while verifying a few OTel logs; I had to copy and paste the message into a text editor and replace each \n with a real line break.
I wish the console output I use for debugging honored line breaks.

@ptodev
Contributor

ptodev commented Jun 26, 2023

Hi @sylvainOL and @cyrille-leclerc, thank you for your feedback. I opened another issue for this: #4261
If you have further comments, please feel free to write them in the dedicated issue.

@sylvainOL

Hi @ptodev, thanks for the comment, I'll follow it :)

@rfratto
Member Author

rfratto commented Jul 27, 2023

@kago-dk We're open to it, but this issue only tracks the initial set of OpenTelemetry Collector components which are already available in static mode; anything net-new should be tracked in a separate issue. Would you mind opening a new issue requesting that the filter processor be added to Flow mode?

@tpaschalis tpaschalis modified the milestones: v0.36.0, v0.37.0 Sep 5, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Grafana Agent (Public) Sep 8, 2023
@github-actions github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label Feb 21, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 21, 2024
7 participants