diff --git a/.github/workflows/docs-update.yml b/.github/workflows/docs-update.yml
new file mode 100644
index 00000000000..2dfa9d52be1
--- /dev/null
+++ b/.github/workflows/docs-update.yml
@@ -0,0 +1,36 @@
name: Update OpenTelemetry Website Docs

on:
  # Triggers only on a manual dispatch.
  workflow_dispatch:

jobs:
  update-docs:
    runs-on: ubuntu-latest
    steps:
      - name: checkout
        uses: actions/checkout@v2.3.4
      - name: make-pr
        env:
          API_TOKEN_GITHUB: ${{secrets.DOC_UPDATE_TOKEN}}
          # The destination repo should always be 'open-telemetry/opentelemetry.io'.
          DESTINATION_REPO: open-telemetry/opentelemetry.io
          # The destination path should be the path to your language's directory in the docs tree (e.g., 'content/en/docs/java').
          DESTINATION_PATH: content/en/docs/collector
          # The source path should be 'website_docs'; all files and folders are copied from here to the destination.
          SOURCE_PATH: website_docs
        run: |
          TARGET_DIR=$(mktemp -d)
          export GITHUB_TOKEN=$API_TOKEN_GITHUB
          git config --global user.name austinlparker
          git config --global user.email austin@lightstep.com
          git clone "https://$API_TOKEN_GITHUB@github.com/$DESTINATION_REPO.git" "$TARGET_DIR"
          rsync -av --delete "$SOURCE_PATH/" "$TARGET_DIR/$DESTINATION_PATH/"
          cd "$TARGET_DIR"
          git checkout -b docs-$GITHUB_REPOSITORY-$GITHUB_SHA
          git add .
          git commit -m "Docs update from $GITHUB_REPOSITORY"
          git push -u origin HEAD:docs-$GITHUB_REPOSITORY-$GITHUB_SHA
          gh pr create -t "Docs Update from $GITHUB_REPOSITORY" -b "This is an automated pull request." -B main -H docs-$GITHUB_REPOSITORY-$GITHUB_SHA
          echo "done"
\ No newline at end of file
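Because the workflow above defines only a `workflow_dispatch` trigger, it never runs on its own. Besides the Actions web UI, a maintainer could dispatch it from the command line; a minimal sketch, assuming a recent GitHub CLI (`gh`) authenticated against this repository:

```bash
# Manually dispatch the docs-update workflow on the default branch.
gh workflow run docs-update.yml
```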
diff --git a/website_docs/_index.md b/website_docs/_index.md
new file mode 100644
index 00000000000..a5124918ac9
--- /dev/null
+++ b/website_docs/_index.md
@@ -0,0 +1,25 @@
---
title: "Collector"
linkTitle: "Collector"
weight: 10
description: >
  Vendor-agnostic way to receive, process and export telemetry data
---

The OpenTelemetry Collector offers a vendor-agnostic implementation of how to
receive, process and export telemetry data. It removes the need to run,
operate, and maintain multiple agents/collectors in order to support
open-source observability data formats (e.g. Jaeger, Prometheus, Fluent Bit,
etc.) sending to one or more open-source or commercial back-ends. The Collector
is the default location to which instrumentation libraries export their
telemetry data.

Objectives:

- Usable: Reasonable default configuration, supports popular protocols, runs and collects out of the box.
- Performant: Highly stable and performant under varying loads and configurations.
- Observable: An exemplar of an observable service.
- Extensible: Customizable without touching the core code.
- Unified: Single codebase, deployable as an agent or collector with support for traces, metrics, and logs (future).
diff --git a/website_docs/configuration.md b/website_docs/configuration.md
new file mode 100644
index 00000000000..3067f3ed421
--- /dev/null
+++ b/website_docs/configuration.md
@@ -0,0 +1,453 @@
---
title: "Configuration"
weight: 20
---

Please be sure to review the following documentation:

- [Data Collection concepts](../../concepts/data-collection) in order to
  understand the repositories applicable to the OpenTelemetry Collector.
- [Security
  guidance](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security.md)

## Basics

The Collector consists of three components that access telemetry data:

- [Receivers](#receivers)
- [Processors](#processors)
- [Exporters](#exporters)

Once configured, these components must be enabled via pipelines within the
[service](#service) section.

There are also [extensions](#extensions), which provide capabilities that can
be added to the Collector but do not require direct access to telemetry data
and are not part of pipelines. They too are enabled within the
[service](#service) section.

An example configuration would look like:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:

exporters:
  otlp:
    endpoint: otelcol:4317

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

Note that the same receiver, processor, exporter and/or pipeline can be defined
more than once. For example:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
  otlp/2:
    protocols:
      grpc:
        endpoint: 0.0.0.0:55690

processors:
  batch:
  batch/test:

exporters:
  otlp:
    endpoint: otelcol:4317
  otlp/2:
    endpoint: otelcol2:4317

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    traces/2:
      receivers: [otlp/2]
      processors: [batch/test]
      exporters: [otlp/2]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

## Receivers

A receiver, which can be push or pull based, is how data gets into the
Collector. Receivers may support one or more [data
sources](../../concepts/data-sources).

The `receivers:` section is how receivers are configured. Many receivers come
with default settings, so simply specifying the name of the receiver is enough
to configure it (for example, `zipkin:`). If configuration is required, or if a
user wants to change the default configuration, then it must be defined in this
section. Any setting you specify overrides the receiver's default for that
parameter.

> Configuring a receiver does not enable it. Receivers are enabled via
> pipelines within the [service](#service) section.

One or more receivers must be configured. By default, no receivers
are configured. A basic example of all available receivers is provided below.

> For detailed receiver configuration, please see the [receiver
> README.md](https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/README.md).

```yaml
receivers:
  # Data sources: logs
  fluentforward:
    listenAddress: 0.0.0.0:8006

  # Data sources: metrics
  hostmetrics:
    scrapers:
      cpu:
      disk:
      filesystem:
      load:
      memory:
      network:
      process:
      processes:
      swap:

  # Data sources: traces
  jaeger:
    protocols:
      grpc:
      thrift_binary:
      thrift_compact:
      thrift_http:

  # Data sources: traces
  kafka:
    protocol_version: 2.0.0

  # Data sources: traces, metrics
  opencensus:

  # Data sources: traces, metrics, logs
  otlp:
    protocols:
      grpc:
      http:

  # Data sources: metrics
  prometheus:
    config:
      scrape_configs:
        - job_name: "otel-collector"
          scrape_interval: 5s
          static_configs:
            - targets: ["localhost:8888"]

  # Data sources: traces
  zipkin:
```
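As noted above, settings you specify override a receiver's defaults rather than replacing the whole receiver. As a sketch, the `hostmetrics` receiver can be narrowed to two scrapers and a slower cadence via its `collection_interval` setting (the 30s value is illustrative only):

```yaml
receivers:
  hostmetrics:
    # Scrape every 30s instead of the receiver's default interval.
    collection_interval: 30s
    scrapers:
      cpu:
      memory:
```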
## Processors

Processors run on data between being received and being exported. Processors
are optional, though [some are
recommended](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor#recommended-processors).

The `processors:` section is how processors are configured. Processors may come
with default settings, but many require configuration. Any configuration for a
processor must be done in this section. Any setting you specify overrides the
processor's default for that parameter.

> Configuring a processor does not enable it. Processors are enabled via
> pipelines within the [service](#service) section.

A basic example of all available processors is provided below.

> For detailed processor configuration, please see the [processor
> README.md](https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/README.md).

```yaml
processors:
  # Data sources: traces
  attributes:
    actions:
      - key: environment
        value: production
        action: insert
      - key: db.statement
        action: delete
      - key: email
        action: hash

  # Data sources: traces, metrics, logs
  batch:

  # Data sources: metrics
  filter:
    metrics:
      include:
        match_type: regexp
        metric_names:
          - prefix/.*
          - prefix_.*

  # Data sources: traces, metrics, logs
  memory_limiter:
    ballast_size_mib: 2000
    check_interval: 5s
    limit_mib: 4000
    spike_limit_mib: 500

  # Data sources: traces
  resource:
    attributes:
      - key: cloud.zone
        value: "zone-1"
        action: upsert
      - key: k8s.cluster.name
        from_attribute: k8s-cluster
        action: insert
      - key: redundant-attribute
        action: delete

  # Data sources: traces
  probabilistic_sampler:
    hash_seed: 22
    sampling_percentage: 15

  # Data sources: traces
  span:
    name:
      to_attributes:
        rules:
          - ^\/api\/v1\/document\/(?P<documentId>.*)\/update$
      from_attributes: ["db.svc", "operation"]
      separator: "::"
```
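The `batch` processor above is shown with defaults only. Its main knobs can be set explicitly when batching needs tuning; a sketch with illustrative values, not recommendations:

```yaml
processors:
  batch:
    # Send a batch once it reaches this many items...
    send_batch_size: 1024
    # ...or once this much time has elapsed, whichever comes first.
    timeout: 5s
```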
## Exporters

An exporter, which can be push or pull based, is how you send data to one or
more backends/destinations. Exporters may support one or more [data
sources](../../concepts/data-sources).

The `exporters:` section is how exporters are configured. Exporters may come
with default settings, but many require configuration to specify at least the
destination and security settings. Any configuration for an exporter must be
done in this section. Any setting you specify overrides the exporter's default
for that parameter.

> Configuring an exporter does not enable it. Exporters are enabled via
> pipelines within the [service](#service) section.

One or more exporters must be configured. By default, no exporters
are configured. A basic example of all available exporters is provided below.

> For detailed exporter configuration, please see the [exporter
> README.md](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/README.md).

```yaml
exporters:
  # Data sources: traces, metrics, logs
  file:
    path: ./filename.json

  # Data sources: traces
  jaeger:
    endpoint: "http://jaeger-all-in-one:14250"
    insecure: true

  # Data sources: traces
  kafka:
    protocol_version: 2.0.0

  # Data sources: traces, metrics, logs
  logging:
    loglevel: debug

  # Data sources: traces, metrics
  opencensus:
    endpoint: "otelcol2:55678"

  # Data sources: traces, metrics, logs
  otlp:
    endpoint: otelcol2:4317
    insecure: true

  # Data sources: traces, metrics
  otlphttp:
    endpoint: https://example.com:55681/v1/traces

  # Data sources: metrics
  prometheus:
    endpoint: "prometheus:8889"
    namespace: "default"

  # Data sources: metrics
  prometheusremotewrite:
    endpoint: "http://some.url:9411/api/prom/push"

  # Data sources: traces
  zipkin:
    endpoint: "http://localhost:9411/api/v2/spans"
```

## Extensions

Extensions are available primarily for tasks that do not involve processing
telemetry data. Examples of extensions include health monitoring, service
discovery, and data forwarding. Extensions are optional.

The `extensions:` section is how extensions are configured. Many extensions
come with default settings, so simply specifying the name of the extension is
enough to configure it (for example, `health_check:`). If configuration is
required, or if a user wants to change the default configuration, then it must
be defined in this section. Any setting you specify overrides the extension's
default for that parameter.

> Configuring an extension does not enable it. Extensions are enabled within
> the [service](#service) section.

By default, no extensions are configured. A basic example of all available
extensions is provided below.

> For detailed extension configuration, please see the [extension
> README.md](https://github.com/open-telemetry/opentelemetry-collector/blob/main/extension/README.md).

```yaml
extensions:
  health_check:
  pprof:
  zpages:
```
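Once the `health_check` extension is enabled in the [service](#service) section, it exposes an HTTP endpoint that orchestration systems can probe. A quick check, assuming the extension's default port of 13133:

```bash
# Returns HTTP 200 when the Collector is up and healthy.
curl -i http://localhost:13133
```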
## Service

The service section is used to configure which components are enabled in the
Collector, based on the configuration found in the receivers, processors,
exporters, and extensions sections. If a component is configured but not
defined within the service section, then it is not enabled. The service section
consists of two sub-sections:

- extensions
- pipelines

Extensions consist of a list of all extensions to enable. For example:

```yaml
service:
  extensions: [health_check, pprof, zpages]
```

Pipelines can be of the following types:

- traces: collects and processes trace data.
- metrics: collects and processes metric data.
- logs: collects and processes log data.

A pipeline consists of a set of receivers, processors and exporters. Each
receiver/processor/exporter must be defined in the configuration outside of the
service section to be included in a pipeline.

*Note:* Each receiver/processor/exporter can be used in more than one pipeline.
When a processor is referenced in multiple pipelines, each pipeline gets a
separate instance of that processor. This is in contrast to receivers and
exporters, where a single instance is shared by all pipelines that reference
it. Also note that the order of processors dictates the order in which data is
processed.

The following is an example pipeline configuration:

```yaml
service:
  pipelines:
    metrics:
      receivers: [opencensus, prometheus]
      exporters: [opencensus, prometheus]
    traces:
      receivers: [opencensus, jaeger]
      processors: [batch]
      exporters: [opencensus, zipkin]
```

## Other Information

### Configuration Environment Variables

The use and expansion of environment variables is supported in the Collector
configuration. For example:

```yaml
processors:
  attributes/example:
    actions:
      - key: "${DB_KEY}"
        action: "${OPERATION}"
```
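With the snippet above, the environment supplies both the attribute key and the action when the configuration is loaded. A sketch of running the Collector against such a configuration (the `otelcol` binary name and the values shown are illustrative):

```bash
# Expands ${DB_KEY} to db.statement and ${OPERATION} to delete at load time.
DB_KEY=db.statement OPERATION=delete ./otelcol --config=config.yaml
```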
### Proxy Support

Exporters that leverage the net/http package (all do today) respect the
following proxy environment variables:

- HTTP_PROXY
- HTTPS_PROXY
- NO_PROXY

If these are set when the Collector starts, then exporters, regardless of
protocol, will proxy traffic (or bypass the proxy) as defined by these
environment variables.
diff --git a/website_docs/getting-started.md b/website_docs/getting-started.md
new file mode 100644
index 00000000000..c687a206b51
--- /dev/null
+++ b/website_docs/getting-started.md
@@ -0,0 +1,156 @@
---
title: "Getting Started"
weight: 1
---

Please be sure to review the [Data Collection
documentation](../../concepts/data-collection) in order to understand the
deployment models, components, and repositories applicable to the OpenTelemetry
Collector.

## Deployment

The OpenTelemetry Collector is shipped as a single binary that supports two
primary deployment methods:

- **Agent:** A Collector instance running with the application or on the same
  host as the application (e.g. binary, sidecar, or daemonset).
- **Gateway:** One or more Collector instances running as a standalone service
  (e.g. container or deployment), typically per cluster, datacenter or region.

### Agent

It is recommended to deploy the Agent on every host within an environment. In
doing so, the Agent is capable of receiving telemetry data (push and pull
based) as well as enhancing telemetry data with metadata such as custom tags or
infrastructure information. In addition, the Agent can offload responsibilities
that client instrumentation would otherwise need to handle, including batching,
retry, encryption, compression and more. OpenTelemetry instrumentation
libraries by default export their data assuming a locally running Collector is
available.

### Gateway

Additionally, a Gateway cluster can be deployed in every cluster, datacenter,
or region. A Gateway cluster runs as a standalone service and can offer
advanced capabilities over the Agent, including tail-based sampling. In
addition, a Gateway cluster can limit the number of egress points required to
send data as well as consolidate API token management. Each Collector instance
in a Gateway cluster operates independently, so it is easy to scale the
architecture based on performance needs with a simple load balancer. If a
Gateway cluster is deployed, it usually receives data from Agents deployed
within an environment.

## Getting Started

### Demo

The demo deploys a load generator, an agent and a gateway, as well as Jaeger,
Zipkin and Prometheus back-ends. More information can be found in the demo
[README.md](https://github.com/open-telemetry/opentelemetry-collector/tree/main/examples/demo).

```bash
$ git clone git@github.com:open-telemetry/opentelemetry-collector.git; \
    cd opentelemetry-collector/examples/demo; \
    docker-compose up -d
```

### Docker

Every release of the Collector is published to Docker Hub and comes with a
default configuration file.

```bash
$ docker run otel/opentelemetry-collector
```

In addition, you can use the local example provided. This example starts a
Docker container of the
[core](https://github.com/open-telemetry/opentelemetry-collector) version of
the Collector with all receivers enabled and exports all the data it receives
locally to a file. Data is sent to the container and the container scrapes its
own Prometheus metrics.

```bash
$ git clone git@github.com:open-telemetry/opentelemetry-collector.git; \
    cd opentelemetry-collector/examples; \
    go build main.go; ./main & pid1="$!"; \
    docker run --rm -p 13133:13133 -p 14250:14250 -p 14268:14268 \
    -p 55678-55679:55678-55679 -p 4317:4317 -p 8888:8888 -p 9411:9411 \
    -v "${PWD}/otel-local-config.yaml":/otel-local-config.yaml \
    --name otelcol otel/opentelemetry-collector \
    --config otel-local-config.yaml; \
    kill $pid1; docker stop otelcol
```
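The default image configuration is a reasonable starting point, but most deployments supply their own. A sketch of mounting a local configuration file into the container (the in-container path is arbitrary, and only the OTLP gRPC port is published here):

```bash
$ docker run --rm -p 4317:4317 \
    -v "${PWD}/config.yaml":/etc/otel-config.yaml \
    otel/opentelemetry-collector \
    --config /etc/otel-config.yaml
```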
### Kubernetes

The example below deploys an agent as a daemonset and a single gateway
instance.

```bash
$ kubectl apply -f https://raw.githubusercontent.com/open-telemetry/opentelemetry-collector/main/examples/k8s/otel-config.yaml
```

The example above is meant to serve as a starting point, to be extended and
customized before actual production usage.

The [OpenTelemetry
Operator](https://github.com/open-telemetry/opentelemetry-operator) can also be
used to provision and maintain an OpenTelemetry Collector instance, with
features such as automatic upgrade handling, `Service` configuration based on
the OpenTelemetry configuration, automatic sidecar injection into deployments,
among others.

### Linux Packaging

Every Collector release includes DEB and RPM packaging for Linux amd64/arm64
systems. The packaging includes a default configuration that can be found at
`/etc/otel-collector/config.yaml` post-installation.

> Please note that systemd is required for automatic service configuration.

To get started on Debian systems, run the following, replacing `v0.20.0` with
the version of the Collector you wish to run and `amd64` with the appropriate
architecture.

```bash
$ sudo apt-get update
$ sudo apt-get -y install wget
$ wget https://github.com/open-telemetry/opentelemetry-collector/releases/download/v0.20.0/otel-collector_0.20.0_amd64.deb
$ sudo dpkg -i otel-collector_0.20.0_amd64.deb
```

To get started on Red Hat systems, run the following, replacing `v0.20.0` with
the version of the Collector you wish to run and `x86_64` with the appropriate
architecture.

```bash
$ sudo yum update
$ sudo yum -y install wget
$ wget https://github.com/open-telemetry/opentelemetry-collector/releases/download/v0.20.0/otel-collector_0.20.0-1_x86_64.rpm
$ sudo rpm -ivh otel-collector_0.20.0-1_x86_64.rpm
```

### Windows Packaging

Every Collector release includes EXE and MSI packaging for Windows amd64
systems. The MSI packaging includes a default configuration that can be found
at `\Program Files\OpenTelemetry Collector\config.yaml`.

> Please note that the Collector service is not automatically started.

The easiest way to get started is to double-click the MSI package and follow
the wizard. Silent installation is also available.

### Local

This example builds the latest version of the Collector for the local operating
system, runs the binary with all receivers enabled, and exports all the data it
receives locally to a file. Data is sent to the Collector, and the Collector
scrapes its own Prometheus metrics.

```bash
$ git clone git@github.com:open-telemetry/opentelemetry-collector.git; \
    cd opentelemetry-collector; make install-tools; make otelcol; \
    go build examples/demo/app/main.go; ./main & pid1="$!"; \
    ./bin/otelcol_$(go env GOOS)_$(go env GOARCH) --config ./examples/local/otel-config.yaml; kill $pid1
```
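Whichever method you use, a running Collector can be smoke-tested via its own telemetry. The examples above expose the Collector's internal Prometheus metrics on port 8888, the same endpoint the `prometheus` receiver example in the configuration docs scrapes:

```bash
$ curl http://localhost:8888/metrics
```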