This repository provides templates and examples of OpenTelemetry hosting and security implementation patterns within core AWS services such as ECS (Elastic Container Service), EC2 (Elastic Compute Cloud), and Lambda.
The templates and examples provide a snapshot of functionality at a specific moment in time (late August 2021) and are not guaranteed to function in perpetuity.
These templates and examples are not currently intended for production use (please see Caveats and Assumptions later in this document for additional details).
OpenTelemetry is a vendor-agnostic observability framework for instrumenting, generating, collecting, and exporting telemetry data. The OpenTelemetry Protocol (OTLP) is vendor neutral. This allows you to send telemetry to multiple backends or change backends entirely—all without rewriting your code.
OpenTelemetry collectors can be deployed as Agents (collector instances running with the application or on the same host, as a sidecar or daemonset) or as a Gateway (a standalone service deployed once per data center or region).
Agent collectors enable applications to offload responsibilities including batching, retry, encryption, and more. An Agent can also enhance telemetry data with metadata such as custom tags or infrastructure information. The Agent pattern frequently simplifies the client implementation of the OpenTelemetry instrumentation.
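To make the batching and retry responsibilities concrete, here is a minimal sketch in plain Python. It is not how the OpenTelemetry collector is implemented; `BatchingForwarder`, its parameters, and the `send` callback are hypothetical names chosen for illustration. It only shows the idea of buffering spans, forwarding them in batches, and retrying with exponential backoff when the downstream is unavailable.

```python
import time
from typing import Callable, List


class BatchingForwarder:
    """Hypothetical helper: buffers spans and forwards them in batches with retry."""

    def __init__(self, send: Callable[[List[dict]], None], batch_size: int = 3,
                 max_retries: int = 2, backoff_seconds: float = 0.0):
        self.send = send                    # assumed callback that ships one batch
        self.batch_size = batch_size
        self.max_retries = max_retries
        self.backoff_seconds = backoff_seconds
        self.buffer: List[dict] = []

    def add(self, span: dict) -> None:
        """Buffer a span and flush once the batch is full."""
        self.buffer.append(span)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> bool:
        """Send the buffered batch, retrying on connection errors."""
        if not self.buffer:
            return True
        batch, self.buffer = self.buffer, []
        for attempt in range(self.max_retries + 1):
            try:
                self.send(batch)
                return True
            except ConnectionError:
                # Exponential backoff between attempts.
                time.sleep(self.backoff_seconds * (2 ** attempt))
        return False  # batch is dropped after exhausting retries
```

With an Agent in place, the application itself does not need this kind of logic; it hands spans to the local collector and moves on.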
Gateway collectors run as a standalone service and can offer advanced capabilities such as tail-based sampling. A Gateway collector can limit the number of egress points required to send data and consolidate API token management. If a Gateway cluster is deployed, it usually receives data from Agent collectors deployed within an environment.
Enterprise vendors who support the OpenTelemetry Protocol (OTLP) include AWS, Datadog, Dynatrace, Honeycomb, and New Relic (among others).
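Tail-based sampling means the keep-or-drop decision is made after all spans of a trace have arrived, which is why it fits a central Gateway rather than individual Agents. The sketch below illustrates the idea in plain Python under simplifying assumptions: spans are dictionaries with hypothetical `trace_id`, `status`, and `duration_ms` keys, and the policy (keep traces with an error or a slow span) is only one example. It is not the collector's actual processor.

```python
from typing import Dict, List


def keep_trace(spans: List[Dict], latency_threshold_ms: float = 500.0) -> bool:
    """Decide on a complete trace: keep it if any span errored or was slow."""
    for span in spans:
        if span.get("status") == "ERROR":
            return True
        if span.get("duration_ms", 0.0) >= latency_threshold_ms:
            return True
    return False


def tail_sample(spans: List[Dict]) -> List[Dict]:
    """Group spans by trace_id, then keep or drop each trace as a whole."""
    traces: Dict[str, List[Dict]] = {}
    for span in spans:
        traces.setdefault(span["trace_id"], []).append(span)
    kept: List[Dict] = []
    for trace_spans in traces.values():
        if keep_trace(trace_spans):
            kept.extend(trace_spans)
    return kept
```

Because every span of a trace must reach the same decision point, this approach assumes the Gateway sees the whole trace.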
Over time, the OpenTelemetry protocol will provide support for telemetry data like traces, metrics, and logs. Currently, only tracing has been released as a generally available, production quality release. Check back on the OpenTelemetry component status for updates on the development lifecycle for other telemetry data, e.g., metrics and logs.
These examples assume you have at least the following tools/services installed and are somewhat fluent in their use:
You may optionally use K6 to generate load for your services once they exist. See load test documentation for additional details.
The examples folder has some code samples to help familiarize yourself with some core OpenTelemetry concepts on your local machine before developing, deploying, and instrumenting more complex applications in AWS.
The architecture diagram below shows the hosting pattern for OpenTelemetry collectors and our sample application. The available routes to users of our sample service are displayed in magenta while the flow of telemetry data is shown in green.
The bottom right corner of the architecture diagram shows the available Gateway collector exporters for this sample implementation. AWS X-Ray is available by default while Honeycomb, Lightstep, and New Relic are all available via optional configuration.
Providing an API key for each of the vendors as an input variable to your Gateway infrastructure will automatically enable the exporter to that vendor back end.
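The selection logic described above can be pictured with the following sketch. It is written in Python purely for illustration and is not the repository's actual mechanism; the function name and the exporter identifiers (`awsxray`, `honeycomb`, `lightstep`, `newrelic`) are assumptions.

```python
from typing import Dict, List, Optional


def enabled_exporters(api_keys: Dict[str, Optional[str]]) -> List[str]:
    """Return the exporters to enable: X-Ray always, vendors only when a key is set."""
    exporters = ["awsxray"]  # available by default, no API key required
    for vendor in ("honeycomb", "lightstep", "newrelic"):
        if api_keys.get(vendor):  # missing, None, or empty keys leave the exporter off
            exporters.append(vendor)
    return exporters
```

For example, supplying only a Honeycomb key would yield X-Ray plus Honeycomb, and supplying no keys would leave X-Ray as the only backend.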
An initial version of this example repo included a self-hosted authentication server to be used in conjunction with the bearertokenauth extension. However, support for this extension was not available in the AWS Distro for OpenTelemetry (ADOT) across all the referenced services (ECS, EC2, and Lambda).
The example authentication flow is now included in a separate, public repository. Alternatively, you can reference the auth tag for this repository instead.
Please see the infrastructure README for additional guidance on deploying the infrastructure used in this demo (includes both the Gateway and sample application).
The sample application exposes the magenta routes from our architecture diagram above and creates the necessary OpenTelemetry Agent collector to export data to our Gateway. The Gateway infrastructure creates the OpenTelemetry Gateway collector to receive data from Agents and export it to one or more of the configured backends.
After packaging and deploying your applications, you can start to generate trace data and visualize it via the configured backend.
To generate trace data, navigate to the demo site landing page, typically demo.your-domain.com (this value is also an output of the application infrastructure deployment), and click on any of the displayed links.
Each of the links will make a request to the /proxy service, which forwards the request to the appropriate service.
Note: the proxy service exports data directly to the OpenTelemetry collector over HTTP using standard HTTP request libraries (see the Lightstep guide for other common use cases that might fit this pattern).
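As a rough illustration of exporting over plain HTTP, the sketch below builds a single-span OTLP/JSON payload and wraps it in a POST request using only the Python standard library. It is not the proxy service's actual code. The helper names, the scope name, and the collector URL in the usage note are assumptions, and the `scopeSpans` field follows the current OTLP JSON encoding; older collector versions may expect a different field name.

```python
import json
import secrets
import time
import urllib.request


def build_otlp_payload(service_name: str, span_name: str, duration_ns: int) -> dict:
    """Build a minimal OTLP/JSON trace payload containing one server span."""
    end = time.time_ns()
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}}
            ]},
            "scopeSpans": [{
                "scope": {"name": "manual-http-example"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 2,  # SPAN_KIND_SERVER
                    "startTimeUnixNano": str(end - duration_ns),
                    "endTimeUnixNano": str(end),
                }],
            }],
        }]
    }


def build_request(collector_url: str, payload: dict) -> urllib.request.Request:
    """Wrap the payload in a JSON POST request aimed at the collector."""
    return urllib.request.Request(
        collector_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Assuming a collector with an OTLP/HTTP receiver listening at `http://localhost:4318/v1/traces`, the span could then be sent with `urllib.request.urlopen(build_request(url, payload))`.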
This application is for demonstration purposes only and is not intended as production-quality infrastructure or application code. As a result, some assumptions have been made about your infrastructure and environment.
Please note that your experience deploying this infrastructure may differ if some of these assumptions are not true for your environment.
- This infrastructure will be deployed into the VPC of a single AWS region
- A single VPC can be identified using one or more of the provided VPC filter criteria
- Public and private subnets that belong to the VPC above can be identified using one or more provided subnet filter criteria
- An Elastic Container Repository (ECR) already exists for your images
- A domain and Route53 hosted zone exist that will serve as the main domain for the subdomains created by this project
The caveats noted below attempt to describe some of the technical issues that limit running this service as a production-quality implementation.
- Some secrets, such as the Honeycomb write key, appear in plain text in both the AWS SSM Parameter Store and the Terraform state file
- Other than secure communication over HTTPS, this example implementation provides minimal direction on securing your OpenTelemetry collector (see Security)