diff --git a/docs/sources/tempo/configuration/grafana-agent/_index.md b/docs/sources/tempo/configuration/grafana-agent/_index.md index 39bcac7c83b..821950b0dba 100644 --- a/docs/sources/tempo/configuration/grafana-agent/_index.md +++ b/docs/sources/tempo/configuration/grafana-agent/_index.md @@ -14,6 +14,12 @@ aliases: collector for sending metrics, logs, and trace data to the opinionated Grafana observability stack. +{{< admonition type="note">}} +Grafana Alloy provides tooling to convert your Agent Static or Flow configuration files into a format that can be used by Alloy. + +For more information, refer to [Migrate to Alloy](https://grafana.com/docs/tempo//configuration/grafana-alloy/migrate-alloy). +{{< /admonition>}} + It's commonly used as a tracing pipeline, offloading traces from the application and forwarding them to a storage backend. Grafana Agent tracing stack is built using OpenTelemetry. diff --git a/docs/sources/tempo/configuration/grafana-agent/automatic-logging.md b/docs/sources/tempo/configuration/grafana-agent/automatic-logging.md index dfc309b9c4a..f70d6102ccf 100644 --- a/docs/sources/tempo/configuration/grafana-agent/automatic-logging.md +++ b/docs/sources/tempo/configuration/grafana-agent/automatic-logging.md @@ -9,6 +9,8 @@ aliases: # Automatic logging: Trace discovery through logs +{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}} + Running instrumented distributed systems is a very powerful way to gain understanding over a system, but it brings its own challenges. One of them is discovering which traces exist. @@ -19,15 +21,15 @@ Automatic logging provides an easy and fast way of discovering trace IDs through log messages. Well-formatted log lines are written to a Loki instance or to `stdout` for each span, root, or process that passes through the tracing pipeline. This allows for automatically building a mechanism for trace -discovery. On top of that, we also get metrics from traces using Loki, and +discovery. On top of that, you can also get metrics from traces using Loki, and allow quickly jumping from a log message to the trace view in Grafana. -While this approach is useful, it isn't as powerful as [TraceQL]({{< relref -"../../traceql" >}}). If you are here because you know you want to log the +While this approach is useful, it isn't as powerful as TraceQL. +If you are here because you know you want to log the trace ID, to enable jumping from logs to traces, then read on! If you want to query the system directly, read the [TraceQL -documentation]({{< relref "../../traceql" >}}). +documentation](https://grafana.com/docs/tempo//traceql). ## Configuration @@ -41,11 +43,17 @@ This allows searching by those key-value pairs in Loki. ## Before you begin +{{< admonition type="note">}} +Grafana Alloy provides tooling to convert your Agent Static or Flow configuration files into a format that can be used by Alloy. + +For more information, refer to [Migrate to Alloy](https://grafana.com/docs/tempo//configuration/grafana-alloy/migrate-alloy). +{{< /admonition>}} + To configure automatic logging, you need to select your preferred backend and the trace data to log. -To see all the available config options, refer to the [configuration reference](/docs/agent/latest/configuration/traces-config). +To see all the available configuration options, refer to the [configuration reference](https://grafana.com/docs/agent/latest/configuration/traces-config). 
-This simple example logs trace roots to stdout and is a good way to get started using automatic logging: +This simple example logs trace roots to `stdout` and is a good way to get started using automatic logging: ```yaml traces: configs: diff --git a/docs/sources/tempo/configuration/grafana-agent/service-graphs.md b/docs/sources/tempo/configuration/grafana-agent/service-graphs.md index 702d15181b0..d6e7ff4037c 100644 --- a/docs/sources/tempo/configuration/grafana-agent/service-graphs.md +++ b/docs/sources/tempo/configuration/grafana-agent/service-graphs.md @@ -9,6 +9,8 @@ aliases: # Enable service graphs +{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}} + A service graph is a visual representation of the interrelationships between various services. Service graphs help to understand the structure of a distributed system, and the connections and dependencies between its components. @@ -26,9 +28,13 @@ Service graphs are generated in Grafana Agent and pushed to a Prometheus-compati Once generated, they can be represented in Grafana as a graph. You need these components to fully use service graphs. -### Enable service graphs in Grafana Agent +{{< admonition type="note">}} +Grafana Alloy provides tooling to convert your Agent Static or Flow configuration files into a format that can be used by Alloy. -{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}} +For more information, refer to [Migrate to Alloy](https://grafana.com/docs/tempo//configuration/grafana-alloy/migrate-alloy). +{{< /admonition>}} + +### Enable service graphs in Grafana Agent To start using service graphs, enable the feature in Grafana Agent configuration. diff --git a/docs/sources/tempo/configuration/grafana-agent/span-metrics.md b/docs/sources/tempo/configuration/grafana-agent/span-metrics.md index 68b283b7922..d23769edee3 100644 --- a/docs/sources/tempo/configuration/grafana-agent/span-metrics.md +++ b/docs/sources/tempo/configuration/grafana-agent/span-metrics.md @@ -9,6 +9,8 @@ aliases: # Generate metrics from spans +{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}} + Span metrics allow you to generate metrics from your tracing data automatically. Span metrics aggregates request, error and duration (RED) metrics from span data. Metrics are exported in Prometheus format. @@ -26,7 +28,13 @@ The generated metrics show application-level insight into your monitoring, as far as tracing gets propagated through your applications. Span metrics are also used in the service graph view. -For more information, refer to the [service graph view]({{< relref "../../metrics-generator/service-graph-view" >}}). +For more information, refer to the [service graph view](https://grafana.com/docs/tempo//metrics-generator/service-graph-view/). + +{{< admonition type="note">}} +Grafana Alloy provides tooling to convert your Agent Static or Flow configuration files into a format that can be used by Alloy. + +For more information, refer to [Migrate to Alloy](https://grafana.com/docs/tempo//configuration/grafana-alloy/migrate-alloy). 
+{{< /admonition>}}

 ## Server-side metrics

diff --git a/docs/sources/tempo/configuration/grafana-agent/tail-based-sampling.md b/docs/sources/tempo/configuration/grafana-agent/tail-based-sampling.md
index 9a674bfc149..65354e9676a 100644
--- a/docs/sources/tempo/configuration/grafana-agent/tail-based-sampling.md
+++ b/docs/sources/tempo/configuration/grafana-agent/tail-based-sampling.md
@@ -9,6 +9,8 @@ aliases:

 # Tail-based sampling

+{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}}
+
 Tempo aims to provide an inexpensive solution that makes 100% sampling possible.
 However, sometimes constraints make a lower sampling percentage necessary or
 desirable, such as runtime or egress traffic related costs.
@@ -57,9 +59,13 @@ If you're using a multi-instance deployment of the agent,
 add load balancing and specify the resolving mechanism to find other Agents in the setup.
 To see all the available configuration options, refer to the [configuration reference](/docs/agent/latest/configuration/traces-config/).

-## Example for Grafana Agent Flow
+{{< admonition type="note">}}
+Grafana Alloy provides tooling to convert your Agent Static or Flow configuration files into a format that can be used by Alloy.

-{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}}
+For more information, refer to [Migrate to Alloy](https://grafana.com/docs/tempo//configuration/grafana-alloy/migrate-alloy).
+{{< /admonition>}}
+
+### Example for Grafana Agent Flow

 [Grafana Agent Flow](/docs/agent/latest/flow/) is a component-based revision of Grafana Agent with a focus on ease-of-use,
 debuggability, and ability to adapt to the needs of power users.
 Flow configuration files are written in River instead of YAML.
diff --git a/docs/sources/tempo/configuration/grafana-alloy/_index.md b/docs/sources/tempo/configuration/grafana-alloy/_index.md
new file mode 100644
index 00000000000..db4efaacf35
--- /dev/null
+++ b/docs/sources/tempo/configuration/grafana-alloy/_index.md
@@ -0,0 +1,141 @@
+---
+title: Grafana Alloy
+description: Configure Grafana Alloy to work with Tempo
+weight: 550
+aliases:
+- /docs/tempo/grafana-alloy
+---
+
+# Grafana Alloy

+Grafana Alloy offers native pipelines for OTel, Prometheus, Pyroscope, Loki, and many other metrics, logs, traces, and profile tools.
+In addition, you can use Alloy pipelines to do other tasks, such as configuring alert rules in Loki and Mimir. Alloy is fully compatible with the OTel Collector, Prometheus Agent, and Promtail.
+
+You can use Alloy as an alternative to any of these solutions, or combine it with them into a hybrid system of multiple collectors and agents.
+You can deploy Alloy anywhere within your IT infrastructure and pair it with your Grafana LGTM stack, a telemetry backend from Grafana Cloud, or any other compatible backend from any other vendor.
+Alloy is flexible, and you can easily configure it to fit your needs, whether on-premises, cloud-only, or a mix of both.
+
+It's commonly used as a tracing pipeline, offloading traces from the
+application and forwarding them to a storage backend.
+
+Grafana Alloy configuration files are written in the [Alloy configuration syntax](https://grafana.com/docs/alloy/latest/concepts/configuration-syntax/).
+
+For more information, refer to the [Introduction to Grafana Alloy](https://grafana.com/docs/alloy/latest/introduction).
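+
+For example, a minimal tracing pipeline that receives OTLP spans from your applications, batches them, and forwards them to Tempo could look like the following sketch. The endpoint is a placeholder, not a value taken from this repository:
+
+```alloy
+// Receive spans over OTLP gRPC and HTTP from instrumented applications.
+otelcol.receiver.otlp "default" {
+  grpc {}
+  http {}
+
+  output {
+    traces = [otelcol.processor.batch.default.input]
+  }
+}
+
+// Batch spans before export to reduce the number of outgoing requests.
+otelcol.processor.batch "default" {
+  output {
+    traces = [otelcol.exporter.otlp.default.input]
+  }
+}
+
+// Export spans to a Tempo-compatible OTLP endpoint (placeholder address).
+otelcol.exporter.otlp "default" {
+  client {
+    endpoint = "tempo.example.local:4317"
+  }
+}
+```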
+ +## Architecture + +Grafana Alloy can be configured to run a set of tracing pipelines to collect data from your applications and write it to Tempo. +Pipelines are built using OpenTelemetry, and consist of `receivers`, `processors`, and `exporters`. +The architecture mirrors that of the OTel Collector's [design](https://github.com/open-telemetry/opentelemetry-collector/blob/846b971758c92b833a9efaf742ec5b3e2fbd0c89/docs/design.md). +See the [components reference](https://grafana.com/docs/alloy/latest/reference/components/) for all available configuration options. + +

Tracing pipeline architecture

+ +This lets you configure multiple distinct tracing +pipelines, each of which collects separate spans and sends them to different +backends. + +### Receiving traces + +Grafana Alloy supports multiple ingestion receivers: +OTLP (OpenTelemetry), Jaeger, Zipkin, OpenCensus, and Kafka. + + +Each tracing pipeline can be configured to receive traces in all these formats. +Traces that arrive to a pipeline go through the receivers/processors/exporters defined in that pipeline. + +### Pipeline processing + +Grafana Alloy processes tracing data as it flows through the pipeline to make the distributed tracing system more reliable and leverage the data for other purposes such as trace discovery, tail-based sampling, and generating metrics. + +#### Batching + +Alloy supports batching of traces. +Batching helps better compress the data, reduces the number of outgoing connections, and is a recommended best practice. +To configure it, refer to the `otelcol.processor.batch` block in the [components reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.processor.batch/). + +#### Attributes manipulation + +Grafana Alloy allows for general manipulation of attributes on spans that pass through it. +A common use may be to add an environment or cluster variable. +There are several processors that can manipulate attributes, some examples include: the `otelcol.processor.attributes` block in the [component reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.processor.attributes/) and the `otelcol.processor.transform` block [component reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.processor.transform/) + +#### Attaching metadata with Prometheus Service Discovery + +Prometheus Service Discovery mechanisms enable you to attach the same metadata to your traces as your metrics. +For example, for Kubernetes users this means that you can dynamically attach metadata for namespace, Pod, and name of the container sending spans. + + +```alloy +otelcol.receiver.otlp "default" { + http {} + grpc {} + + output { + traces = [otelcol.processor.k8sattributes.default.input] + } +} + +otelcol.processor.k8sattributes "default" { + extract { + metadata = [ + "k8s.namespace.name", + "k8s.pod.name", + "k8s.container.name" + ] + } + + output { + traces = [otelcol.exporter.otlp.default.input] + } +} + +otelcol.exporter.otlp "default" { + client { + endpoint = env("OTLP_ENDPOINT") + } +} +``` + +Refer to the `otelcol.processor.k8sattributes` block in the [components reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.processor.k8sattributes/). + +#### Trace discovery through automatic logging + +Automatic logging writes well formatted log lines to help with trace discovery. + +For a closer look into the feature, visit [Automatic logging](https://grafana.com/docs/tempo//configuration/grafana-alloy/automatic-logging/). + +#### Tail-based sampling + +Alloy implements tail-based sampling for distributed tracing systems and multi-instance Alloy deployments. +With this feature, you can make sampling decisions based on data from a trace, rather than exclusively with probabilistic methods. + +For a detailed description, refer to [Tail-based sampling](https://grafana.com/docs/tempo//configuration/grafana-alloy/tail-based-sampling). + +#### Generating metrics from spans + +Alloy can take advantage of the span data flowing through the pipeline to generate Prometheus metrics. 
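+
+As a sketch, traces can be routed into the span metrics connector and the resulting metrics forwarded to any Prometheus-compatible backend; the remote write URL below is a placeholder:
+
+```alloy
+// Route traces from your receiver or processors to
+// otelcol.connector.spanmetrics.default.input to generate metrics from them.
+otelcol.connector.spanmetrics "default" {
+  histogram {
+    // Use the default explicit histogram buckets.
+    explicit {}
+  }
+
+  output {
+    metrics = [otelcol.exporter.prometheus.default.input]
+  }
+}
+
+// Convert the generated metrics to Prometheus format and remote write them.
+otelcol.exporter.prometheus "default" {
+  forward_to = [prometheus.remote_write.default.receiver]
+}
+
+prometheus.remote_write "default" {
+  endpoint {
+    url = "https://prometheus.example.local/api/v1/write"
+  }
+}
+```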
+ +Refer to [Span metrics](https://grafana.com/docs/tempo//configuration/grafana-alloy/span-metrics/) for a more detailed explanation of the feature. + +#### Service graph metrics + +Service graph metrics represent the relationships between services within a distributed system. + +This service graphs processor builds a map of services by analyzing traces, with the objective to find _edges_. +Edges are spans with a parent-child relationship, that represent a jump, such as a request, between two services. +The amount of requests and their duration are recorded as metrics, which are used to represent the graph. + +To read more about this processor, go to its [section](https://grafana.com/docs/tempo//configuration/grafana-alloy/service-graphs). + +### Exporting spans + +Alloy can export traces to multiple different backends for every tracing pipeline. +Exporting is built using OpenTelemetry Collector's [OTLP exporter](https://github.com/open-telemetry/opentelemetry-collector/blob/846b971758c92b833a9efaf742ec5b3e2fbd0c89/exporter/otlpexporter/README.md). +Alloy supports exporting tracing in OTLP format. + +Aside from endpoint and authentication, the exporter also provides mechanisms for retrying on failure, +and implements a queue buffering mechanism for transient failures, such as networking issues. + +To see all available options, +refer to the `otelcol.exporter.otlp` block in the [Alloy configuration reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.exporter.otlp/) and the `otelcol.exporter.otlphttp` block in the [Alloy configuration reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.exporter.otlphttp/). diff --git a/docs/sources/tempo/configuration/grafana-alloy/automatic-logging-example-query.png b/docs/sources/tempo/configuration/grafana-alloy/automatic-logging-example-query.png new file mode 100644 index 00000000000..45f4d92c0cb Binary files /dev/null and b/docs/sources/tempo/configuration/grafana-alloy/automatic-logging-example-query.png differ diff --git a/docs/sources/tempo/configuration/grafana-alloy/automatic-logging-example-results.png b/docs/sources/tempo/configuration/grafana-alloy/automatic-logging-example-results.png new file mode 100644 index 00000000000..aec3b66912b Binary files /dev/null and b/docs/sources/tempo/configuration/grafana-alloy/automatic-logging-example-results.png differ diff --git a/docs/sources/tempo/configuration/grafana-alloy/automatic-logging.md b/docs/sources/tempo/configuration/grafana-alloy/automatic-logging.md new file mode 100644 index 00000000000..6172b077e8a --- /dev/null +++ b/docs/sources/tempo/configuration/grafana-alloy/automatic-logging.md @@ -0,0 +1,116 @@ +--- +title: 'Automatic logging: Trace discovery through logs' +description: Automatic logging provides an easy and fast way of getting trace discovery through logs. +menuTitle: Automatic logging +weight: 200 +aliases: +- /docs/tempo/grafana-alloy/automatic-logging +--- + +# Automatic logging: Trace discovery through logs + +Running instrumented distributed systems is a very powerful way to gain +understanding over a system, but it brings its own challenges. One of them is +discovering which traces exist. + +Using the span logs connector, you can use Alloy to perform automatic logging. + +In the beginning of Tempo, querying for a trace was only possible if you knew +the ID of the trace you were looking for. One solution was automatic logging. +Automatic logging provides an easy and fast way of discovering trace IDs +through log messages. 
+Well-formatted log lines are written to a logs exporter +for each span, root, or process that passes through the tracing +pipeline. This allows for automatically building a mechanism for trace discovery. +On top of that, you can also get metrics from traces using a logs source, and +allow quickly jumping from a log message to the trace view in Grafana. + +While this approach is useful, it isn't as powerful as TraceQL. +If you are here because you know you want to log the +trace ID, to enable jumping from logs to traces, then read on. + +If you want to query the system directly, read the [TraceQL +documentation](https://grafana.com/docs/tempo//traceql). + +## Configuration + +For high throughput systems, logging for every span may generate too much volume. +In such cases, logging per root span or process is recommended. + +

Automatic logging overview

+ +Automatic logging searches for a given set of span or resource attributes in the spans and logs them as key-value pairs. +This allows searching by those key-value pairs in Loki. + +## Before you begin + +To configure automatic logging, you need to configure the `otelcol.connector.spanlogs` connector with +appropriate options. + +To see all the available configuration options, refer to the `otelcol.connector.spanlogs` [components reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.connector.spanlogs/). + +This simple example logs trace roots before exporting them to the Grafana OTLP gateway, +and is a good way to get started using automatic logging: + +```alloy +otelcol.receiver.otlp "default" { + grpc {} + http {} + + output { + traces = [otelcol.connector.spanlogs.default.input] + } +} + +otelcol.connector.spanlogs "default" { + roots = true + + output { + logs = [otelcol.exporter.otlp.default.input] + } +} + +otelcol.exporter.otlp "default" { + client { + endpoint = env("OTLP_ENDPOINT") + } +} +``` + +This example logs all trace roots, adding the `http.method` and `http.target` attributes to the log line, +then pushes logs to a local Loki instance: + +```alloy +otelcol.receiver.otlp "default" { + grpc {} + http {} + + output { + traces = [otelcol.connector.spanlogs.default.input] + } +} + +otelcol.connector.spanlogs "default" { + roots = true + span_attributes = ["http.method", "http.target"] + + output { + logs = [otelcol.exporter.loki.default.input] + } +} + +otelcol.exporter.loki "default" { + forward_to = [loki.write.local.receiver] +} + +loki.write "local" { + endpoint { + url = "loki:3100" + } +} +``` + +## Examples + +
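+As a further sketch, the connector can also log one line per process instead of per root span and include the `service.name` resource attribute on each line; check the `otelcol.connector.spanlogs` reference linked above for the exact argument names, and this reuses the `otelcol.exporter.loki "default"` exporter from the previous example:
+
+```alloy
+otelcol.connector.spanlogs "per_process" {
+  // Log one line for each process rather than for each root span.
+  processes          = true
+  process_attributes = ["service.name"]
+
+  output {
+    logs = [otelcol.exporter.loki.default.input]
+  }
+}
+```
+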

Automatic logging overview

+

Automatic logging overview

diff --git a/docs/sources/tempo/configuration/grafana-alloy/migrate-alloy.md b/docs/sources/tempo/configuration/grafana-alloy/migrate-alloy.md new file mode 100644 index 00000000000..79b9c8d782b --- /dev/null +++ b/docs/sources/tempo/configuration/grafana-alloy/migrate-alloy.md @@ -0,0 +1,27 @@ +--- +title: Migrate to Alloy +description: Provides links to documentation to migrate to Grafana Alloy. +weight: 100 +--- + +# Migrate to Alloy + +Grafana Alloy is the new name for the Grafana Labs distribution of the OpenTelemetry collector. +Grafana Agent Static, Grafana Agent Flow, and Grafana Agent Operator have been deprecated and are in Long-Term Support (LTS) through October 31, 2025. They will reach an End-of-Life (EOL) on November 1, 2025. +Grafana Labs has provided tools and migration documentation to assist you in migrating to Grafana Alloy. + +Read more about why we recommend migrating to [Grafana Alloy](https://grafana.com/blog/2024/04/09/grafana-alloy-opentelemetry-collector-with-prometheus-pipelines/). + +This section provides links to documentation for how to migrate to Alloy. + +- [Migrate from Grafana Agent Static](https://grafana.com/docs/alloy/latest/tasks/migrate/from-static/) + +- [Migrate from Grafana Agent Flow](https://grafana.com/docs/alloy/latest/tasks/migrate/from-flow/) + +- [Migrate from Grafana Agent Operator](https://grafana.com/docs/alloy/latest/tasks/migrate/from-operator/) + +- [Migrate from OpenTelemetry Collector](https://grafana.com/docs/alloy/latest/tasks/migrate/from-otelcol/) + +- [Migrate from Prometheus](https://grafana.com/docs/alloy/latest/tasks/migrate/from-prometheus/) + +- [Migrate from Promtail](https://grafana.com/docs/alloy/latest/tasks/migrate/from-promtail/) diff --git a/docs/sources/tempo/configuration/grafana-alloy/service-graphs.md b/docs/sources/tempo/configuration/grafana-alloy/service-graphs.md new file mode 100644 index 00000000000..00677ca4553 --- /dev/null +++ b/docs/sources/tempo/configuration/grafana-alloy/service-graphs.md @@ -0,0 +1,73 @@ +--- +title: Enable service graphs +menuTitle: Enable service graphs +description: Service graphs help to understand the structure of a distributed system, and the connections and dependencies between its components. +weight: +aliases: + - ../../grafana-alloy/service-graphs/ # /docs/tempo//grafana-alloy/service-graphs/ +--- + +# Enable service graphs + +A service graph is a visual representation of the interrelationships between various services. +Service graphs help to understand the structure of a distributed system, +and the connections and dependencies between its components. + +The same service graph metrics can also be generated by Tempo. +This is more efficient and recommended for larger installations. +For a deep look into service graphs, visit [this section](https://grafana.com/docs/tempo//metrics-generator/service_graphs). + +Service graphs are also used in the application performance management dashboard. +For more information, refer to the [service graph view documentation](https://grafana.com/docs/tempo//metrics-generator/service_graph-view). + +## Before you begin + +Service graphs are generated in Grafana Alloy and pushed to a Prometheus-compatible backend. +Once generated, they can be represented in Grafana as a graph. +You need these components to fully use service graphs. + +### Enable service graphs in Grafana Alloy + +{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}} + +To start using service graphs, enable the feature in the Alloy configuration. 
+
+The following example adds the `http.method` and `http.target` span attributes as Prometheus labels
+to the generated service graph metrics, before writing the metrics to the Grafana OTLP gateway.
+Received trace spans are immediately written to the OTLP gateway.
+
+```alloy
+otelcol.receiver.otlp "default" {
+  grpc {}
+  http {}
+
+  output {
+    traces = [
+      otelcol.connector.servicegraph.default.input,
+      otelcol.exporter.otlp.default.input
+    ]
+  }
+}
+
+otelcol.connector.servicegraph "default" {
+  dimensions = ["http.method", "http.target"]
+  output {
+    metrics = [otelcol.exporter.otlp.default.input]
+  }
+}
+
+otelcol.exporter.otlp "default" {
+  client {
+    endpoint = env("OTLP_ENDPOINT")
+  }
+}
+```
+
+To see all the available configuration options, refer to the [component reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.connector.servicegraph/).
+
+### Grafana
+
+The same service graph metrics can also be generated by Tempo.
+This is more efficient and recommended for larger installations.
+
+For additional information about viewing service graph metrics in Grafana and calculating cardinality, refer to the [server-side documentation](https://grafana.com/docs/tempo//metrics-generator/service_graphs/enable-service-graphs).
diff --git a/docs/sources/tempo/configuration/grafana-alloy/span-metrics.md b/docs/sources/tempo/configuration/grafana-alloy/span-metrics.md
new file mode 100644
index 00000000000..d09f1ef4274
--- /dev/null
+++ b/docs/sources/tempo/configuration/grafana-alloy/span-metrics.md
@@ -0,0 +1,98 @@
+---
+title: Generate metrics from spans
+menuTitle: Generate metrics from spans
+description: Span metrics allow you to generate metrics from your tracing data automatically.
+weight:
+aliases:
+- /docs/tempo/grafana-alloy/span-metrics
+---
+
+# Generate metrics from spans
+
+Span metrics allow you to generate metrics from your tracing data automatically.
+Span metrics aggregates request, error and duration (RED) metrics from span data.
+Metrics are exported in Prometheus format.
+
+There are two options available for exporting metrics: using remote write to a Prometheus-compatible backend or serving the metrics locally and scraping them.
+
+Span metrics generate two metrics: a counter that counts requests, and a histogram that measures operation durations.
+
+Span metrics are of particular interest if your system isn't monitored with metrics,
+but it has distributed tracing implemented.
+You get out-of-the-box metrics from your tracing pipeline.
+
+Even if you already have metrics, span metrics can provide in-depth monitoring of your system.
+The generated metrics show application-level insight into your monitoring,
+as far as tracing gets propagated through your applications.
+
+To generate span metrics within Grafana Alloy, you can use the `otelcol.connector.spanmetrics` component.
+The following example:
+* Adds the `http.method` (with a default value of `GET`) and `http.target` span attributes as Prometheus labels
+  to the generated span metrics.
+* Sets an explicit set of histogram bucket intervals.
+* Specifies a metrics flush period of 15 seconds.
+* Uses the `traces_spanmetrics` namespace to prefix all generated metrics.
+
+The generated metrics are then written to the Grafana OTLP gateway.
+Received trace spans are immediately written to the OTLP gateway.
+ +```alloy +otelcol.receiver.otlp "default" { + http {} + grpc {} + + output { + traces = [ + otelcol.connector.spanmetrics.default.input, + otelcol.exporter.otlp.default.input + ] + } +} + +otelcol.connector.spanmetrics "default" { + dimension { + name = "http.method" + default = "GET" + } + + dimension { + name = "http.target" + } + + aggregation_temporality = "DELTA" + + histogram { + explicit { + buckets = ["50ms", "100ms", "250ms", "1s", "5s", "10s"] + } + } + + metrics_flush_interval = "15s" + + namespace = "traces_spanmetrics" + + output { + metrics = [otelcol.exporter.otlp.default.input] + } +} + +otelcol.exporter.otlp "default" { + client { + endpoint = env("OTLP_ENDPOINT") + } +} +``` + +Span metrics are also used in the service graph view. +For more information, refer to the [service graph view](https://grafana.com/docs/tempo//metrics-generator/service_graph-view). + +To see all the available configuration options, refer to the [component reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.connector.spanmetrics/). + +## Server-side metrics + +The same span metrics can also be generated by the metrics-generator within Tempo. +This is more efficient and recommended for larger installations. +For more information, refer to the [span metrics](https://grafana.com/docs/tempo//metrics-generator/span_metrics) documentation. + +## Example + +

Span metrics overview

diff --git a/docs/sources/tempo/configuration/grafana-alloy/tail-based-sampling.md b/docs/sources/tempo/configuration/grafana-alloy/tail-based-sampling.md
new file mode 100644
index 00000000000..5586ded80ce
--- /dev/null
+++ b/docs/sources/tempo/configuration/grafana-alloy/tail-based-sampling.md
@@ -0,0 +1,127 @@
+---
+title: Enable tail-based sampling
+menuTitle: Enable tail-based sampling
+description: Use tail-based sampling to optimize sampling decisions
+weight:
+aliases:
+- /docs/tempo/grafana-alloy/tail-based-sampling
+---
+
+# Enable tail-based sampling
+
+Tempo aims to provide an inexpensive solution that makes 100% sampling possible, reducing the need to sample at all.
+However, sometimes constraints make a lower sampling percentage necessary or desirable,
+such as runtime or egress traffic related costs.
+Probabilistic sampling strategies are easy to implement,
+but also run the risk of discarding relevant data that you'll later want.
+
+Tail-based sampling works with Grafana Alloy.
+Alloy configuration files are written in [Alloy configuration syntax](https://grafana.com/docs/alloy/latest/concepts/configuration-syntax/).
+
+## How tail-based sampling works
+
+In tail-based sampling, sampling decisions are made at the end of the workflow, allowing for a more accurate sampling decision.
+Alloy groups spans by trace ID and checks the trace's data to see
+if it meets one of the defined policies (for example, `latency` or `status_code`).
+For instance, a policy can check if a trace contains an error or if it took
+longer than a certain duration.
+
+A trace is sampled if it meets at least one policy.
+
+To group spans by trace ID, Alloy buffers spans for a configurable amount of time,
+after which it considers the trace complete.
+Longer-running traces are split into more than one trace.
+However, waiting longer times increases the memory overhead of buffering.
+
+One particular challenge of grouping trace data arises in multi-instance Alloy deployments,
+where spans that belong to the same trace can arrive at different Alloy instances.
+To solve that, you can configure Alloy to load balance traces across Alloy instances
+by exporting spans belonging to the same trace to the same instance.
+
+This is achieved by redistributing spans by trace ID once they arrive from the application.
+Alloy must be able to discover and connect to other Alloy instances where spans for the same trace can arrive.
+Kubernetes users should use a [headless service](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services).
+
+Redistributing spans by trace ID means that spans are sent and received twice,
+which can cause a significant increase in CPU usage.
+This overhead increases with the number of Alloy instances that share the same traces.
+
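+A sketch of this load-balancing layer for a multi-instance deployment might look like the following. The headless service name is a placeholder, and the sampling policies themselves are defined separately, as shown in the configuration section below:
+
+```alloy
+otelcol.receiver.otlp "ingest" {
+  grpc {}
+  http {}
+
+  output {
+    // Redistribute the received spans by trace ID across the Alloy instances.
+    traces = [otelcol.exporter.loadbalancing.default.input]
+  }
+}
+
+otelcol.exporter.loadbalancing "default" {
+  // Discover the other Alloy instances through a Kubernetes headless service.
+  resolver {
+    dns {
+      hostname = "alloy-traces-headless.default.svc.cluster.local"
+      port     = "4317"
+    }
+  }
+
+  protocol {
+    otlp {
+      client {
+        tls {
+          insecure = true
+        }
+      }
+    }
+  }
+}
+```
+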

Tail-based sampling overview

+ +## Configure tail-based sampling + +To start using tail-based sampling, define a sampling policy in your configuration file. + +If you're using a multi-instance deployment of Alloy, +add load balancing and specify the resolving mechanism to find other Alloy instances in the setup. + +To see all the available configuration options for load balancing, refer to the [Alloy component reference](https://grafana.com/docs/alloy/latest/reference/components/otelcol.exporter.loadbalancing/). + +### Example for Alloy + +Alloy uses the [`otelcol.processor.tail_sampling component`](https://grafana.com/docs/alloy/latest/reference/components/otelcol.processor.tail_sampling/) for tail-based sampling. + +```alloy +otelcol.receiver.otlp "default" { + http {} + grpc {} + + output { + traces = [otelcol.processor.tail_sampling.policies.input] + } +} + +// The Tail Sampling processor will use a set of policies to determine which received +// traces to keep and send to Tempo. +otelcol.processor.tail_sampling "policies" { + // Total wait time from the start of a trace before making a sampling decision. + // Note that smaller time periods can potentially cause a decision to be made + // before the end of a trace has occurred. + decision_wait = "30s" + + // The following policies follow a logical OR pattern, meaning that if any of the + // policies match, the trace will be kept. For logical AND, you can use the `and` + // policy. Every span of a trace is examined by each policy in turn. A match will + // cause a short-circuit. + + // This policy defines that traces that contain errors should be kept. + policy { + // The name of the policy can be used for logging purposes. + name = "sample-erroring-traces" + // The type must match the type of policy to be used, in this case examining + // the status code of every span in the trace. + type = "status_code" + // This block determines the error codes that should match in order to keep + // the trace, in this case the OpenTelemetry 'ERROR' code. + status_code { + status_codes = [ "ERROR" ] + } + } + + // This policy defines that only traces that are longer than 200ms in total + // should be kept. + policy { + // The name of the policy can be used for logging purposes. + name = "sample-long-traces" + // The type must match the policy to be used, in this case the total latency + // of the trace. + type = "latency" + // This block determines the total length of the trace in milliseconds. + latency { + threshold_ms = 200 + } + } + + // The output block forwards the kept traces onto the batch processor, which + // will marshall them for exporting to the Grafana OTLP gateway. 
+ output { + traces = [otelcol.exporter.otlp.default.input] + } +} + +otelcol.exporter.otlp "default" { + client { + endpoint = env("OTLP_ENDPOINT") + } +} +``` diff --git a/docs/sources/tempo/configuration/grafana-alloy/tempo-auto-log.svg b/docs/sources/tempo/configuration/grafana-alloy/tempo-auto-log.svg new file mode 100644 index 00000000000..6cee35c19a0 --- /dev/null +++ b/docs/sources/tempo/configuration/grafana-alloy/tempo-auto-log.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/sources/tempo/configuration/grafana-alloy/tempo-tail-based-sampling.svg b/docs/sources/tempo/configuration/grafana-alloy/tempo-tail-based-sampling.svg new file mode 100644 index 00000000000..a0b91224c65 --- /dev/null +++ b/docs/sources/tempo/configuration/grafana-alloy/tempo-tail-based-sampling.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/sources/tempo/configuration/querying.md b/docs/sources/tempo/configuration/use-trace-data.md similarity index 53% rename from docs/sources/tempo/configuration/querying.md rename to docs/sources/tempo/configuration/use-trace-data.md index d2220c18a0f..0f3a6c3eff5 100644 --- a/docs/sources/tempo/configuration/querying.md +++ b/docs/sources/tempo/configuration/use-trace-data.md @@ -1,20 +1,21 @@ --- -title: Use Tempo with Grafana -menuTitle: Use Tempo with Grafana -description: Learn how to configure and query Tempo with Grafana. +title: Use tracing data in Grafana +menuTitle: Use tracing data in Grafana +description: Learn how to configure and query Tempo data with Grafana. +aliases: +- ./querying/ # /docs/tempo//configuration/querying weight: 900 --- - +# Use tracing data in Grafana -# Use Tempo with Grafana - -You can use Tempo as a data source in Grafana to Tempo can query Grafana directly. Grafana Cloud comes pre-configured with a Tempo data source. +You can use Tempo as a data source in Grafana. +Grafana Cloud comes pre-configured with a Tempo data source. If you are using Grafana on-prem, you need to [set up the Tempo data source](/docs/grafana//datasources/tempo). {{< admonition type="tip" >}} -If you want to see what you can do with tracing data in Grafana, try the [Intro to Metrics, Logs, Traces, and Profiling example]({{< relref "../getting-started/docker-example" >}}). +If you want to explore tracing data in Grafana, try the [Intro to Metrics, Logs, Traces, and Profiling example]({{< relref "../getting-started/docker-example" >}}). {{% /admonition %}} This video explains how to add data sources, including Loki, Tempo, and Mimir, to Grafana and Grafana Cloud. Tempo data source set up starts at 4:58 in the video. @@ -23,7 +24,7 @@ This video explains how to add data sources, including Loki, Tempo, and Mimir, t ## Configure the data source -For detailed instructions on the Tempo dta source in Grafana, refer to [Tempo data source](https://grafana.com/docs/grafana//datasources/tempo/). +For detailed instructions on the Tempo data source in Grafana, refer to [Tempo data source](https://grafana.com/docs/grafana//datasources/tempo/). To configure Tempo with Grafana: @@ -34,6 +35,6 @@ The port of 3200 is a common port used in our examples. Tempo default HTTP port ## Query the data source -Refer to [Tempo in Grafana]({{< relref "../getting-started/tempo-in-grafana" >}}) for an overview about how tracing data can be viewed and used in Grafana. +Refer to [Tempo in Grafana](https://grafana.com/docs/tempo//getting-started/tempo-in-grafana) for an overview about how tracing data can be viewed and queried in Grafana. 
For information on querying the Tempo data source, refer to [Tempo query editor](https://grafana.com/docs/grafana//datasources/tempo/query-editor/). \ No newline at end of file diff --git a/docs/sources/tempo/getting-started/_index.md b/docs/sources/tempo/getting-started/_index.md index 030f6725f3d..88bcb78aa9f 100644 --- a/docs/sources/tempo/getting-started/_index.md +++ b/docs/sources/tempo/getting-started/_index.md @@ -22,7 +22,7 @@ client instrumentation, pipeline, backend, and visualization. This diagram illustrates a tracing system configuration: -

Tracing pipeline overview

+

Tracing pipeline overview

## Client instrumentation @@ -34,7 +34,7 @@ create and offload spans. To learn more about instrumentation, read the [Instrument for tracing]({{< relref "./instrumentation" >}}) documentation to learn how to instrument your favorite language for distributed tracing. {{% /admonition %}} -## Pipeline (Grafana Agent) +## Pipeline (Grafana Alloy) Once your application is instrumented for tracing, the traces need to be sent to a backend for storage and visualization. You can build a tracing pipeline that @@ -42,20 +42,18 @@ offloads spans from your application, buffers them, and eventually forwards them Tracing pipelines are optional (most clients can send directly to Tempo), but the pipelines become more critical the larger and more robust your tracing system is. -Grafana Agent is a service that is deployed close to the application, either on the same node or +Grafana Alloy is a service that is deployed close to the application, either on the same node or within the same cluster (in Kubernetes) to quickly offload traces from the application and forward them to a storage backend. -Grafana Agent also abstracts features like trace batching to a remote trace backend store, including retries on write failures. +Alloy also abstracts features like trace batching to a remote trace backend store, including retries on write failures. -To learn more about Grafana Agent and how to set it up for tracing with Tempo, -refer to [Grafana Agent traces configuration docs](/docs/agent/latest/static/configuration/traces-config/). - -{{< docs/shared source="alloy" lookup="agent-deprecation.md" version="next" >}} +To learn more about Grafana Alloy and how to set it up for tracing with Tempo, +refer to [Grafana Alloy configuration for tracing]({{< relref "../configuration/grafana-alloy" >}}). {{< admonition type="note" >}} The [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) / [Jaeger Agent](https://www.jaegertracing.io/docs/latest/deployment/) can also be used at the agent layer. Refer to [this blog post](/blog/2021/04/13/how-to-send-traces-to-grafana-clouds-tempo-service-with-opentelemetry-collector/) -to see how the OpenTelemetry Collector can be used with Grafana Cloud Tempo. +to see how the OpenTelemetry Collector can be used with Tempo. {{% /admonition %}} ## Backend (Tempo) @@ -72,13 +70,11 @@ Next, review the [Setup documentation]({{< relref "../setup" >}}) for step-by-st Tempo offers different deployment options, depending upon your needs. Refer to the [plan your deployment]({{< relref "../setup/deployment" >}}) section for more information {{< admonition type="note" >}} -The Grafana Agent is already set up to use Tempo. -Refer to the [configuration](/docs/agent/latest/configuration/traces-config/) and [example](https://github.com/grafana/agent/blob/main/example/docker-compose/agent/config/agent.yaml) for details. +Grafana Alloy is already set up to use Tempo. +Refer to [Grafana Alloy configuration for tracing](https://grafana.com/docs/tempo//configuration/grafana-alloy). {{% /admonition %}} ## Visualization (Grafana) Grafana has a built-in Tempo data source that can be used to query Tempo and visualize traces. For more information, refer to the [Tempo data source](/docs/grafana/latest/datasources/tempo) and the [Tempo in Grafana]({{< relref "./tempo-in-grafana" >}}) topics. - -For more information, refer to the [Tempo in Grafana]({{< relref "./tempo-in-grafana" >}}) documentation. 
diff --git a/docs/sources/tempo/getting-started/assets/getting-started.png b/docs/sources/tempo/getting-started/assets/getting-started.png deleted file mode 100644 index 240c278c4f0..00000000000 Binary files a/docs/sources/tempo/getting-started/assets/getting-started.png and /dev/null differ diff --git a/docs/sources/tempo/getting-started/assets/tempo-get-started-overiew.svg b/docs/sources/tempo/getting-started/assets/tempo-get-started-overiew.svg new file mode 100644 index 00000000000..076c2f2bd38 --- /dev/null +++ b/docs/sources/tempo/getting-started/assets/tempo-get-started-overiew.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/sources/tempo/troubleshooting/unable-to-see-trace.md b/docs/sources/tempo/troubleshooting/unable-to-see-trace.md index 81b8dba4217..d0e034dd5e6 100644 --- a/docs/sources/tempo/troubleshooting/unable-to-see-trace.md +++ b/docs/sources/tempo/troubleshooting/unable-to-see-trace.md @@ -147,9 +147,6 @@ To fix connection issues: frontend_worker: frontend_address: query-frontend-discovery.default.svc.cluster.local:9095 ``` - - Confirm that the Grafana data source is configured correctly and debug network issues between Grafana and Tempo. To fix an insufficient permissions issue: