Flow: naming convention for components #2059

Closed
rfratto opened this issue Aug 23, 2022 · 9 comments
Labels
  • frozen-due-to-age: Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.
  • proposal: Proposal or RFC

Comments

@rfratto
Member

rfratto commented Aug 23, 2022

Background

If you include #1872 and #1843, this is our current list of Flow components:

  • discovery.k8s
  • integrations.node_exporter
  • local.file
  • metrics.mutate
  • metrics.remote_write
  • metrics.scrape
  • otel.exporter_jaeger
  • otel.exporter_otlp
  • otel.exporter_zipkin
  • otel.receiver_jaeger
  • otel.receiver_otlp
  • otel.receiver_zipkin
  • remote.s3
  • targets.mutate

To speed up development time, these components were initially named without much consideration to how they interact with other components.

We will have many more components as time goes on, and so we have a growing need for a naming convention to align related components.

Namespaces

Every component is in a "namespace" where related components are placed. For example, the remote.s3 component above is in the remote namespace.

Namespaces may be nested when appropriate. For example, an a.b.c component is in the a.b namespace (i.e., the b namespace inside the a namespace). Nesting a namespace is appropriate when components in a namespace can be divided into multiple responsibilities.
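
As a rough illustration of the nesting rule above, a dotted component name can be split at its last dot: everything before the dot is the namespace, and the final segment is the component. This is a minimal Python sketch; `split_component` and `parent_namespaces` are hypothetical helpers written for this example and are not part of Flow.

```python
from typing import List, Tuple


def split_component(full_name: str) -> Tuple[str, str]:
    """Split a dotted component name into (namespace, component).

    Hypothetical helper for illustration only; not part of Flow.
    """
    namespace, sep, component = full_name.rpartition(".")
    if not sep or not namespace or not component:
        raise ValueError(f"expected <namespace>.<component>, got {full_name!r}")
    return namespace, component


def parent_namespaces(namespace: str) -> List[str]:
    """List a namespace and every namespace it is nested in, innermost first."""
    parts = namespace.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]
```

With these helpers, `split_component("a.b.c")` yields the namespace `a.b` and the component `c`, and `parent_namespaces("a.b")` lists `a.b` followed by `a`.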

Proposal

Namespaces should:

  1. Group components by the data they act on in a pipeline
  2. Be named with singular nouns

Components should:

  1. Be named with a verb if they transform data
  2. Otherwise, either verbs or nouns are acceptable

I am proposing our list of components becomes:

  • discovery.k8s
  • discovery.relabel
  • local.file
  • otelcol.exporter.jaeger
  • otelcol.exporter.otlp
  • otelcol.exporter.zipkin
  • otelcol.receiver.jaeger
  • otelcol.receiver.otlp
  • otelcol.receiver.zipkin
  • prometheus.integration.node_exporter
  • prometheus.relabel
  • prometheus.scrape
  • prometheus.remote_write
  • remote.s3

discovery namespace

The discovery namespace holds components which expose targets. Its name is unchanged from the existing namespace.

The existing targets.mutate component is renamed to discovery.relabel, since it also has the purpose of exposing targets to other components in the pipeline.

local namespace

The local namespace holds components which expose data from the local machine. Its name is unchanged from the existing namespace.

otelcol namespace

The otelcol namespace holds components which expose OpenTelemetry Collector components as Flow components. It was renamed from the otel namespace to make the intent clearer; "otel" is a generic term for all of OpenTelemetry, and should be avoided.

To avoid polluting the otelcol namespace, it is broken up into two nested namespaces: otelcol.receiver for OpenTelemetry Collector receivers, and otelcol.exporter for OpenTelemetry Collector exporters. The otelcol.processor namespace will be introduced when processor components are added.

prometheus namespace

The prometheus namespace holds components which expose a pipeline of Prometheus metrics, from scraping to remote_write. It is renamed from the metrics namespace.

The specific name prometheus is chosen to make it more obvious to users what type of data is being acted on. This is related to #2037, which discusses using Prometheus directly instead of a generic abstraction for metrics.

It contains one nested namespace, prometheus.integration, described below.

prometheus.integration namespace

The prometheus.integration namespace holds Prometheus exporters. It is renamed from the integrations namespace.

Naming it prometheus.exporter was considered, but that risks confusing users since Prometheus exporters and OpenTelemetry Collector exporters serve two different functions: Prometheus exporters expose data, OpenTelemetry Collector exporters write data.

Most integrations in the static mode Grafana Agent would belong to this namespace. Some static mode integrations, like app_o11y_receiver and eventhandler, would not belong in this namespace and would instead need to be placed elsewhere.

remote namespace

The remote namespace holds components which expose data from remote APIs or machines. Its name is unchanged from the existing namespace.
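
The renames described in the sections above can be collected into a lookup table, for example to flag old names in an existing configuration. The following is a minimal Python sketch. The `RENAMES` table and both helpers are hypothetical, and the pairing of metrics.mutate with prometheus.relabel is inferred from comparing the two lists rather than stated outright in this proposal. Components whose names do not change map to themselves.

```python
from typing import Dict

# Old component name -> proposed name, taken from the two lists in this proposal.
RENAMES: Dict[str, str] = {
    "discovery.k8s": "discovery.k8s",
    "integrations.node_exporter": "prometheus.integration.node_exporter",
    "local.file": "local.file",
    # Assumption: inferred from the lists; the proposal does not state this pair.
    "metrics.mutate": "prometheus.relabel",
    "metrics.remote_write": "prometheus.remote_write",
    "metrics.scrape": "prometheus.scrape",
    "otel.exporter_jaeger": "otelcol.exporter.jaeger",
    "otel.exporter_otlp": "otelcol.exporter.otlp",
    "otel.exporter_zipkin": "otelcol.exporter.zipkin",
    "otel.receiver_jaeger": "otelcol.receiver.jaeger",
    "otel.receiver_otlp": "otelcol.receiver.otlp",
    "otel.receiver_zipkin": "otelcol.receiver.zipkin",
    "remote.s3": "remote.s3",
    "targets.mutate": "discovery.relabel",
}


def new_name(old: str) -> str:
    """Return the proposed name for an old component name.

    Raises KeyError for names that are not in the original list.
    """
    return RENAMES[old]


def renamed_only() -> Dict[str, str]:
    """Return only the entries whose name actually changes."""
    return {old: new for old, new in RENAMES.items() if old != new}
```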

@rfratto rfratto added the proposal and flow labels Aug 23, 2022
@mattdurham
Collaborator

For "Group components by the data they act on in a pipeline", are we talking about the conceptual type of data or the wire type of data? For example, let's say we had a component that takes in graphite metrics. Conceptually both prometheus and graphite work on metrics, but their line-level implementation is different.

@rfratto
Member Author

rfratto commented Aug 23, 2022

"let's say we had a component that takes in graphite metrics."

We'd put those in a graphite namespace, in that case.

@rfratto
Member Author

rfratto commented Aug 23, 2022

My opinion is written more clearly in #2037, but I don't think we should have a generic "metrics" namespace, precisely because not all metrics are the same. When I say "Group components by the data they act on in a pipeline", I'm talking mainly about the data format, so that all Prometheus-specific components are put in their own group, and likewise for components which expose service discovery target objects.

@tpaschalis
Member

I like this approach and separation. I think it will make it easier for people who want to construct an end-to-end Prometheus pipeline to know where to look.

One question is that the current components under the discovery namespace are still close to the Prometheus way of working. Do we think we'd be able to reuse them for otelcol and logs in the future?

@mattdurham
Collaborator

I am a bit torn on this. Working off conceptual datatypes helps with the findability of components. If someone is coming to use the agent without much knowledge of telemetry, then otel, prometheus, graphite, loki, and tempo are all foreign concepts. Metrics, traces, and logs are clearer and also provide some guidance on how to connect them.

The user can intuit that a metrics.graphite receiver can connect to a metrics.remote_write. Whereas it's harder to intuit graphite.receiver can connect to prometheus.remote_write.

That being said, if you know what you are looking for, then this product-based grouping is easier.

@rfratto
Member Author

rfratto commented Aug 24, 2022

"One question is that the current components under the discovery namespace are still close to the Prometheus way of working. Do we think we'd be able to reuse them for otelcol and logs in the future?"

Yeah, it's the Prometheus SD model, which many things reuse. Not all OpenTelemetry components would be able to consume the targets, but the Prometheus SD processor would.

@rfratto
Member Author

rfratto commented Aug 24, 2022

I understand the temptation to make generic signal namespaces, but it's a technical challenge I don't think we're ready to handle.

"The user can intuit that a metrics.graphite receiver can connect to a metrics.remote_write. Whereas it's harder to intuit graphite.receiver can connect to prometheus.remote_write."

What I'm implying here and in #2037 is that a graphite.receiver couldn't connect to prometheus.remote_write because they're two different data formats. If we want users to be able to convert Graphite data into Prometheus data, we would need a component which can handle the conversion.

My biggest concern with making all metrics-related components compatible with each other is defining an abstraction which supports all of the functionality needed for anything consuming that abstraction. It will take a significant amount of work to define a metric type which OTLP, Graphite, Prometheus, and any future metrics component can all use without any information loss.

Allowing different data formats to define different namespaces means that the fully native path remains 100% compliant with upstream and isn't subject to any data loss. The user would then have to explicitly opt in to conversion, which allows them to make a conscious decision and be aware of the side effects that may happen when crossing data format barriers.
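
To make the idea of opt-in conversion concrete, here is a minimal Python sketch. Every name in it (`GraphiteSample`, `PrometheusSample`, `graphite_to_prometheus`, `remote_write`) is hypothetical and not a Flow API. The point is only that each format gets its own type, the Prometheus-only sink rejects anything else, and the lossy step is a function the user has to call explicitly.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphiteSample:
    path: str        # dotted hierarchy, e.g. "servers.web01.cpu.load"
    value: float
    timestamp: int   # seconds since the Unix epoch


@dataclass(frozen=True)
class PrometheusSample:
    name: str
    value: float
    timestamp_ms: int  # milliseconds since the Unix epoch


def graphite_to_prometheus(sample: GraphiteSample) -> PrometheusSample:
    """Explicit, lossy conversion: the dotted hierarchy is flattened into a name."""
    # Replace characters outside [a-zA-Z0-9_:] with underscores.
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", sample.path)
    # Prefix names that would otherwise start with a digit.
    if name and name[0].isdigit():
        name = "_" + name
    return PrometheusSample(name, sample.value, sample.timestamp * 1000)


def remote_write(sample: PrometheusSample) -> str:
    """Stand-in for a Prometheus-only sink; it refuses other data formats."""
    if not isinstance(sample, PrometheusSample):
        raise TypeError(f"expected PrometheusSample, got {type(sample).__name__}")
    return f"{sample.name} {sample.value} {sample.timestamp_ms}"
```

In this sketch, passing a `GraphiteSample` straight to `remote_write` raises a `TypeError`, so the user has to go through `graphite_to_prometheus` first. The conversion discards the dots in the Graphite path, which is one example of a side effect the user accepts by opting in.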

@mattdurham
Collaborator

This sounds reasonable. We can make it implicit when the conversion happens.

@rfratto
Member Author

rfratto commented Sep 20, 2022

All components have been renamed to follow this convention. Closing as done.

@rfratto rfratto closed this as completed Sep 20, 2022
@github-actions github-actions bot added the frozen-due-to-age label Feb 22, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 22, 2024

3 participants