From 57178d718f6355ade0f5901a8b8a30c4e89d036c Mon Sep 17 00:00:00 2001 From: James Bebbington Date: Wed, 3 Jun 2020 22:08:57 +1000 Subject: [PATCH 1/4] Added auto resource detection proposal --- text/0111-auto-resource-detection.md | 115 +++++++++++++++++++++++++++ 1 file changed, 115 insertions(+) create mode 100644 text/0111-auto-resource-detection.md diff --git a/text/0111-auto-resource-detection.md b/text/0111-auto-resource-detection.md new file mode 100644 index 000000000..46188cc50 --- /dev/null +++ b/text/0111-auto-resource-detection.md @@ -0,0 +1,115 @@ +# Automatic Resource Detection + +Describes a mechanism to support auto-detection of resource information. + +## Motivation + +Resource information, i.e. attributes associated with the entity producing telemetry, can currently be supplied to tracer and meter providers or appended in custom exporters. In addition to this, it would be useful to have a mechanism to automatically detect resource information from the host (e.g. from an environment variable or from aws, gcp, etc metadata) and apply this to all kinds of telemetry. This will in many cases prevent users from having to manually configure resource information. + +Note there are some existing implementations of this already in the SDKs (see [below](#prior-art-and-alternatives)), but nothing currently in the specification. + +## Explanation + +In order to apply auto-detected resource information to all kinds of telemetry, a user will need to: + +- Configure which resource detector(s) they would like to run (e.g. AWS EC2 detector) +- Provide the resource information returned by the configured detector(s) to the relevant tracer or meter provider(s) + +Note this means that if the user wants to make use of auto-detected resource information, they will be required to explicitly pass the resource to the trace and/or metric provider(s). 
+ +If multiple detectors are configured, and more than one of these successfully detects a resource, the resources will be merged according to the Merge interface already defined in the specification, i.e. the earliest matched resource's attributes will take precedence. Each detector may be run in parallel, but to ensure deterministic results, the resources must be merged in the order the detectors were added. + +A default implementation of a detector that reads resource data from the `OTEL_RESOURCE` environment variable will be included in the SDK. The environment variable will contain of a list of key value pairs, and these are expected to be represented in a format similar to the [W3C Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), i.e.: `key1=value1,key2=value2`. This detector must always be configured as the first detector by default. + +Custom resource detectors related to specific environments (e.g. specific cloud vendors) must be implemented outside of the SDK, and users will need to import these separately. + +In order to allow users to be able to use different detectors for different trace or meter providers, the detector(s) must be configured against a resource provider, and users can create multiple providers if desired. 
+ +## Internal details + +As described above, the following will be added to the Resource SDK specification: + +- An interface for "detectors", to retrieve resource information +- Specification for the resource provider, which can have custom detectors added, and will return a merged resource +- Details of the default "From Environment Variable" detector implementation described above + +This is a relatively small proposal so is easiest to explain the details with a code example: + +### Usage + +The following example in Go creates a trace provider that uses resource information automatically detected from AWS or GCP: + +Assumes a dependency has been added on the `otel api`, `otel sdk`, `otel awsdetector`, and `otel gcpdetector` packages. + +```go +rp := resource.NewMustProvider() // or NewProvider (see below) +rp.AddDetectors(awsdetector.EC2, gcpdetector.GCE, gcpdetector.GKE) +resource := rp.AutoDetect(ctx) +tp := sdktrace.NewProvider(sdktrace.WithResource(resource)) +``` + +Or more simply: + +```go +tp := sdktrace.NewProvider(sdktrace.MustDetectResource(awsdetector.EC2, gcpdetector.GCE, gcpdetector.GKE)) // or DetectResource (see below) +``` + +In the case of both `WithResource` & `DetectResource` being supplied, the detected resource will be merged with the supplied resource (with the supplied resource taking precedence). + +### Components + +#### Detector + +The Detector interface will simply return a Resource: + +```go +type Detector interface { + Detect(ctx context.Context) (*Resource, error) +} +``` + +If a detector is not able to detect a resource, it must return an uninitialized resource such that the result of each call to `Detect` can be merged. + +#### Provider + +A Provider will have a function to add detectors & to retrieve the auto-detected reosurce information: + +```go +type Provider struct { + AddDetectors(detectors ...Detector) { ... } + AutoDetect(ctx context.Context) (*Resource, error) { ... 
} +} +``` + +### Error Handling + +In the case of one or more detectors raising an error, there are two reasonable options: + +1. Ignore that detector, and continue with a warning (likely meaning we will continue without expected resource information) +2. Crash the application (raise a panic) + +These options will be provided as separate interfaces to let the user decide how to recover from failure, e.g. `Provider` & `ProviderMust` + +## Trade-offs and mitigations + +- The resource provider adds a small amount of complexity that may not be necessary if we can't think of any use cases where a user would want to configure different sets of detectors, but omitting it enforces that restriction. We could add a `global.SetResourceProvider(rp)` function similar to the convention for metrics & traces, but there isn't much value in doing this as the resource provider would only be used during initialization. +- In this proposal, no resource detection will happen by default. It may be preferable to make trace / metric providers use the default resource provider if no resource is supplied. This would be in line with the behaviour of the current Java implementation - see [below](#prior-art-and-alternatives) +- In the case of an error at resource detection time, another alternative would be to start a background thread to retry following some strategy, but it's not clear that there would be much value in doing this, and it would add considerable unnecessary complexity. + +## Prior art and alternatives + +This proposal is largely inspired by the existing OpenCensus specification, the OpenCensus Go implementation, and the OpenTelemetry JS implementation. 
For reference, see the relevant section of the [OpenCensus specification](https://github.com/census-instrumentation/opencensus-specs/blob/master/resource/Resource.md#populating-resources) + +### Existing OpenTelemetry implementations + +- Resource detection implementation in JS SDK [here](https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-resources): The JS implementation is very similar to this proposal. This proposal adds a resource provider instead of just having a global `DetectResources` function. In addition, vendor specific resource detection code is currently in the resource package, so this would need to be separated. +- Environment variable resource detection in Java SDK [here](https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/resources/EnvVarResource.java): This implementation does not currently include a detector interface, and this detector is used by default for trace and meter providers (so this proposal would introduce a breaking change in its current form) + +## Open questions + +- Does this interfere with any other upcoming specification changes related to resources? +- If custom detectors need to live outside the core repo, what is the expectation regarding where they should be hosted? + +## Future possibilities + +When the Collector is run as an agent, the same interface, shared with the Go SDK, could be used to append resource information detected from the host to all kinds of telemetry in a Processor (probably as an extension to the existing Resource Processor). This would require a translation from the SDK resource to the collector's internal representation of a resource. 
From a8291297225965b12153422a22f54e245cec7a61 Mon Sep 17 00:00:00 2001 From: James Bebbington Date: Mon, 8 Jun 2020 14:52:52 +1000 Subject: [PATCH 2/4] Removed resource provider concept as per review comments --- text/0111-auto-resource-detection.md | 55 ++++++++-------------------- 1 file changed, 16 insertions(+), 39 deletions(-) diff --git a/text/0111-auto-resource-detection.md b/text/0111-auto-resource-detection.md index 46188cc50..b70205086 100644 --- a/text/0111-auto-resource-detection.md +++ b/text/0111-auto-resource-detection.md @@ -10,52 +10,36 @@ Note there are some existing implementations of this already in the SDKs (see [b ## Explanation -In order to apply auto-detected resource information to all kinds of telemetry, a user will need to: +In order to apply auto-detected resource information to all kinds of telemetry, a user will need to configure which resource detector(s) they would like to run (e.g. AWS EC2 detector). These will be configured for each tracer and meter provider. -- Configure which resource detector(s) they would like to run (e.g. AWS EC2 detector) -- Provide the resource information returned by the configured detector(s) to the relevant tracer or meter provider(s) +If multiple detectors are configured, and more than one of these successfully detects a resource, the resources will be merged according to the Merge interface already defined in the specification, i.e. the earliest matched resource's attributes will take precedence. Each detector may be run in parallel, but to ensure deterministic results, the resources must be merged in the order the detectors were added. In the case of the user manually supplying resource attributes in addition to resource(s) being detected, the detected resource will be merged with the supplied resource, with the supplied resource taking precedence. 
-Note this means that if the user wants to make use of auto-detected resource information, they will be required to explicitly pass the resource to the trace and/or metric provider(s). - -If multiple detectors are configured, and more than one of these successfully detects a resource, the resources will be merged according to the Merge interface already defined in the specification, i.e. the earliest matched resource's attributes will take precedence. Each detector may be run in parallel, but to ensure deterministic results, the resources must be merged in the order the detectors were added. - -A default implementation of a detector that reads resource data from the `OTEL_RESOURCE` environment variable will be included in the SDK. The environment variable will contain of a list of key value pairs, and these are expected to be represented in a format similar to the [W3C Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), i.e.: `key1=value1,key2=value2`. This detector must always be configured as the first detector by default. +A default implementation of a detector that reads resource data from the `OTEL_RESOURCE` environment variable will be included in the SDK. The environment variable will contain of a list of key value pairs, and these are expected to be represented in a format similar to the [W3C Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), i.e.: `key1=value1,key2=value2`. This detector must always be configured as the first detector and will always be run by default. Custom resource detectors related to specific environments (e.g. specific cloud vendors) must be implemented outside of the SDK, and users will need to import these separately. 
-In order to allow users to be able to use different detectors for different trace or meter providers, the detector(s) must be configured against a resource provider, and users can create multiple providers if desired. - ## Internal details As described above, the following will be added to the Resource SDK specification: -- An interface for "detectors", to retrieve resource information -- Specification for the resource provider, which can have custom detectors added, and will return a merged resource -- Details of the default "From Environment Variable" detector implementation described above +- An interface for "detectors", to retrieve resource information, that can be supplied to tracer and meter providers +- Specification for how to merge resources returned by the configured detectors, and with a manually supplied resource as described above +- Details of the default "From Environment Variable" detector implementation as described above This is a relatively small proposal so is easiest to explain the details with a code example: ### Usage -The following example in Go creates a trace provider that uses resource information automatically detected from AWS or GCP: +The following example in Go creates a tracer and meter provider that uses resource information automatically detected from AWS or GCP: -Assumes a dependency has been added on the `otel api`, `otel sdk`, `otel awsdetector`, and `otel gcpdetector` packages. +Assumes a dependency has been added on the `otel/api`, `otel/sdk`, `otel/awsdetector`, and `otel/gcpdetector` packages. 
```go -rp := resource.NewMustProvider() // or NewProvider (see below) -rp.AddDetectors(awsdetector.EC2, gcpdetector.GCE, gcpdetector.GKE) -resource := rp.AutoDetect(ctx) -tp := sdktrace.NewProvider(sdktrace.WithResource(resource)) +resources := resource.Detector[]{awsdetector.EC2, gcpdetector.GCE, gcpdetector.GKE} +tp := sdktrace.NewProvider(sdktrace.MustDetectResource(resources)) // or DetectResource (see below) +mp := push.New(..., push.MustDetectResource(resources)) ``` -Or more simply: - -```go -tp := sdktrace.NewProvider(sdktrace.MustDetectResource(awsdetector.EC2, gcpdetector.GCE, gcpdetector.GKE)) // or DetectResource (see below) -``` - -In the case of both `WithResource` & `DetectResource` being supplied, the detected resource will be merged with the supplied resource (with the supplied resource taking precedence). - ### Components #### Detector @@ -72,14 +56,9 @@ If a detector is not able to detect a resource, it must return an uninitialized #### Provider -A Provider will have a function to add detectors & to retrieve the auto-detected reosurce information: +In addition to supplying a way to associate a Resource with a tracer or meter provider, i.e. `WithResource`, the SDK will supply a way to associate a set of Detectors with a tracer or meter provider, i.e. `DetectResource`. -```go -type Provider struct { - AddDetectors(detectors ...Detector) { ... } - AutoDetect(ctx context.Context) (*Resource, error) { ... } -} -``` +Because the same detectors will be used across different providers, if detection is not relatively trivial, the results should be cached inside the detector. ### Error Handling @@ -88,12 +67,10 @@ In the case of one or more detectors raising an error, there are two reasonable 1. Ignore that detector, and continue with a warning (likely meaning we will continue without expected resource information) 2. 
Crash the application (raise a panic) -These options will be provided as separate interfaces to let the user decide how to recover from failure, e.g. `Provider` & `ProviderMust` +These options will be provided as separate interfaces to let the user decide how to recover from failure, e.g. `DetectResource` & `MustDetectResource` ## Trade-offs and mitigations -- The resource provider adds a small amount of complexity that may not be necessary if we can't think of any use cases where a user would want to configure different sets of detectors, but omitting it enforces that restriction. We could add a `global.SetResourceProvider(rp)` function similar to the convention for metrics & traces, but there isn't much value in doing this as the resource provider would only be used during initialization. -- In this proposal, no resource detection will happen by default. It may be preferable to make trace / metric providers use the default resource provider if no resource is supplied. This would be in line with the behaviour of the current Java implementation - see [below](#prior-art-and-alternatives) - In the case of an error at resource detection time, another alternative would be to start a background thread to retry following some strategy, but it's not clear that there would be much value in doing this, and it would add considerable unnecessary complexity. ## Prior art and alternatives @@ -102,8 +79,8 @@ This proposal is largely inspired by the existing OpenCensus specification, the ### Existing OpenTelemetry implementations -- Resource detection implementation in JS SDK [here](https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-resources): The JS implementation is very similar to this proposal. This proposal adds a resource provider instead of just having a global `DetectResources` function. In addition, vendor specific resource detection code is currently in the resource package, so this would need to be separated. 
-- Environment variable resource detection in Java SDK [here](https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/resources/EnvVarResource.java): This implementation does not currently include a detector interface, and this detector is used by default for trace and meter providers (so this proposal would introduce a breaking change in its current form) +- Resource detection implementation in JS SDK [here](https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-resources): The JS implementation is very similar to this proposal. This proposal states that the SDK will allow detectors to be passed into telemetry providers directly instead of just having a global `DetectResources` function which the user will need to call and pass in explicitly. In addition, vendor specific resource detection code is currently in the JS resource package, so this would need to be separated. +- Environment variable resource detection in Java SDK [here](https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/resources/EnvVarResource.java): This implementation does not currently include a detector interface, but is used by default for tracer and meter providers ## Open questions From 18155e69edef07a434ea0dcb33ba52555802f683 Mon Sep 17 00:00:00 2001 From: James Bebbington Date: Tue, 30 Jun 2020 19:18:46 +1000 Subject: [PATCH 3/4] Changed the proposal back to separating resource detection from the tracer/meter providers, clarified default resource detection in more detail, and added more points to the trade-offs & mitigations section --- text/0111-auto-resource-detection.md | 41 +++++++++++++++++----------- 1 file changed, 25 insertions(+), 16 deletions(-) diff --git a/text/0111-auto-resource-detection.md b/text/0111-auto-resource-detection.md index b70205086..a4764670b 100644 --- a/text/0111-auto-resource-detection.md +++ b/text/0111-auto-resource-detection.md @@ 
-10,23 +10,24 @@ Note there are some existing implementations of this already in the SDKs (see [b ## Explanation -In order to apply auto-detected resource information to all kinds of telemetry, a user will need to configure which resource detector(s) they would like to run (e.g. AWS EC2 detector). These will be configured for each tracer and meter provider. +In order to apply auto-detected resource information to all kinds of telemetry, a user will need to configure which resource detector(s) they would like to run (e.g. AWS EC2 detector). -If multiple detectors are configured, and more than one of these successfully detects a resource, the resources will be merged according to the Merge interface already defined in the specification, i.e. the earliest matched resource's attributes will take precedence. Each detector may be run in parallel, but to ensure deterministic results, the resources must be merged in the order the detectors were added. In the case of the user manually supplying resource attributes in addition to resource(s) being detected, the detected resource will be merged with the supplied resource, with the supplied resource taking precedence. +If multiple detectors are configured, and more than one of these successfully detects a resource, the resources will be merged according to the Merge interface already defined in the specification, i.e. the earliest matched resource's attributes will take precedence. Each detector may be run in parallel, but to ensure deterministic results, the resources must be merged in the order the detectors were added. -A default implementation of a detector that reads resource data from the `OTEL_RESOURCE` environment variable will be included in the SDK. 
The environment variable will contain of a list of key value pairs, and these are expected to be represented in a format similar to the [W3C Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), i.e.: `key1=value1,key2=value2`. This detector must always be configured as the first detector and will always be run by default. +A default implementation of a detector that reads resource data from the `OTEL_RESOURCE` environment variable will be included in the SDK. The environment variable will contain of a list of key value pairs, and these are expected to be represented in a format similar to the [W3C Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), except that additional semi-colon delimited metadata is not supported, i.e.: `key1=value1,key2=value2`. If the user does not specify any resource, this detector will be run by default. -Custom resource detectors related to specific environments (e.g. specific cloud vendors) must be implemented outside of the SDK, and users will need to import these separately. +Custom resource detectors related to specific environments (e.g. specific cloud vendors) must be implemented as packages separate to the core SDK, and users will need to import these separately. 
## Internal details As described above, the following will be added to the Resource SDK specification: -- An interface for "detectors", to retrieve resource information, that can be supplied to tracer and meter providers -- Specification for how to merge resources returned by the configured detectors, and with a manually supplied resource as described above -- Details of the default "From Environment Variable" detector implementation as described above +- An interface for "detectors", to retrieve resource information +- Specification for a global function to merge resources returned by a set of detectors +- Details of the "from environment variable" detector implementation as described above +- Specification that default detection (from environment variable) runs once on startup, and is used by all tracer & meter providers by default if no custom resource is supplied -This is a relatively small proposal so is easiest to explain the details with a code example: +This is a relatively small proposal so it is easiest to explain the details with a code example: ### Usage @@ -35,9 +36,9 @@ The following example in Go creates a tracer and meter provider that uses resour Assumes a dependency has been added on the `otel/api`, `otel/sdk`, `otel/awsdetector`, and `otel/gcpdetector` packages. ```go -resources := resource.Detector[]{awsdetector.EC2, gcpdetector.GCE, gcpdetector.GKE} -tp := sdktrace.NewProvider(sdktrace.MustDetectResource(resources)) // or DetectResource (see below) -mp := push.New(..., push.MustDetectResource(resources)) +resource, _ := sdkresource.Detect(ctx, 5*time.Second, awsdetector.EC2, gcpdetector.GCE) +tp := sdktrace.NewProvider(sdktrace.WithResource(resource)) +mp := push.New(..., push.WithResource(resource)) ``` ### Components @@ -52,13 +53,15 @@ type Detector interface { } ``` -If a detector is not able to detect a resource, it must return an uninitialized resource such that the result of each call to `Detect` can be merged.
+The `Detect` function should contain a mechanism to time out and cancel the request. If a detector is not able to detect a resource, it must return an uninitialized resource such that the result of each call to `Detect` can be merged. -#### Provider +#### Global Function -In addition to supplying a way to associate a Resource with a tracer or meter provider, i.e. `WithResource`, the SDK will supply a way to associate a set of Detectors with a tracer or meter provider, i.e. `DetectResource`. +The SDK will provide a `Detect` function. This will take a set of detectors that should be run and merged in order as described in the intro, and return a resource. -Because the same detectors will be used across different providers, if detection is not relatively trivial, the results should be cached inside the detector. +```go +func Detect(ctx context.Context, timeout time.Duration, detectors ...Detector) (*Resource, error) +``` ### Error Handling @@ -67,10 +70,15 @@ In the case of one or more detectors raising an error, there are two reasonable 1. Ignore that detector, and continue with a warning (likely meaning we will continue without expected resource information) 2. Crash the application (raise a panic) -These options will be provided as separate interfaces to let the user decide how to recover from failure, e.g. `DetectResource` & `MustDetectResource` +The user can decide how to recover from failure. ## Trade-offs and mitigations +- This OTEP proposes storing vendor resource detection packages outside of the SDK. This ensures the SDK is free of vendor-specific code. Given the relatively straightforward & minimal amount of code generally needed to perform resource detection, and the relatively small number of cloud providers, we may instead decide it's okay for all the resource detection code to live in the SDK directly.
+ - If we do allow vendor resource detection packages in the SDK, we presumably need to prohibit these from depending on non-trivial libraries +- This OTEP proposes only performing environment variable resource detection by default. Given the relatively small number of cloud providers, we may instead decide it's okay to run all detectors by default. This raises the question of whether any restrictions would need to be put on this, and how we would handle this in the future if the number of cloud providers rises. It would be difficult to back out of running these by default as that would lead to a breaking change. +- This OTEP proposes a global function the user calls with the detectors they want to run, and then expects the user to pass the resulting resource into the providers. An alternative option (that was previously proposed in this OTEP) would be to supply a set of detectors directly to the metric or trace provider instead of, or as an additional option to, a static resource. That would result in marginally simpler setup code where the user doesn't need to call `Detect` themselves. Another advantage of this approach is that it's easier to specify default detectors and override these separately from any static resource the user may want to provide. On the downside, this approach adds the complexity of having to deal with merging the detected resource with a static resource if provided. It also potentially adds a lot of complexity around how to avoid having detectors run multiple times since they will be configured for each provider. Avoiding having to specify detectors for tracer & meter providers is the primary reason for not going with that option in the end. +- The attribute proto now supports arrays & maps.
We could support parsing this out of the `OTEL_RESOURCE` environment variable similar to how Correlation Context supports semi colon lists of keys & key-value pairs, but the added complexity is probably not worthwhile implementing unless someone has a good use case for this. - In the case of an error at resource detection time, another alternative would be to start a background thread to retry following some strategy, but it's not clear that there would be much value in doing this, and it would add considerable unnecessary complexity. ## Prior art and alternatives @@ -86,6 +94,7 @@ This proposal is largely inspired by the existing OpenCensus specification, the - Does this interfere with any other upcoming specification changes related to resources? - If custom detectors need to live outside the core repo, what is the expectation regarding where they should be hosted? +- Also see the [Trade-offs and mitigations](#trade-offs-and-mitigations) section ## Future possibilities From 7bbfeaea429dd550545fc66878f247bfc5f34b66 Mon Sep 17 00:00:00 2001 From: James Bebbington Date: Wed, 1 Jul 2020 11:24:03 +1000 Subject: [PATCH 4/4] Wrap lines --- text/0111-auto-resource-detection.md | 167 +++++++++++++++++++-------- 1 file changed, 122 insertions(+), 45 deletions(-) diff --git a/text/0111-auto-resource-detection.md b/text/0111-auto-resource-detection.md index a4764670b..fb19deefb 100644 --- a/text/0111-auto-resource-detection.md +++ b/text/0111-auto-resource-detection.md @@ -1,39 +1,68 @@ # Automatic Resource Detection -Describes a mechanism to support auto-detection of resource information. +Introduce a mechanism to support auto-detection of resources. ## Motivation -Resource information, i.e. attributes associated with the entity producing telemetry, can currently be supplied to tracer and meter providers or appended in custom exporters. In addition to this, it would be useful to have a mechanism to automatically detect resource information from the host (e.g. 
from an environment variable or from aws, gcp, etc metadata) and apply this to all kinds of telemetry. This will in many cases prevent users from having to manually configure resource information. +Resource information, i.e. attributes associated with the entity producing +telemetry, can currently be supplied to tracer and meter providers or appended +in custom exporters. In addition to this, it would be useful to have a mechanism +to automatically detect resource information from the host (e.g. from an +environment variable or from aws, gcp, etc metadata) and apply this to all kinds +of telemetry. This will in many cases prevent users from having to manually +configure resource information. -Note there are some existing implementations of this already in the SDKs (see [below](#prior-art-and-alternatives)), but nothing currently in the specification. +Note there are some existing implementations of this already in the SDKs (see +[below](#prior-art-and-alternatives)), but nothing currently in the +specification. ## Explanation -In order to apply auto-detected resource information to all kinds of telemetry, a user will need to configure which resource detector(s) they would like to run (e.g. AWS EC2 detector). - -If multiple detectors are configured, and more than one of these successfully detects a resource, the resources will be merged according to the Merge interface already defined in the specification, i.e. the earliest matched resource's attributes will take precedence. Each detector may be run in parallel, but to ensure deterministic results, the resources must be merged in the order the detectors were added. - -A default implementation of a detector that reads resource data from the `OTEL_RESOURCE` environment variable will be included in the SDK. 
The environment variable will contain of a list of key value pairs, and these are expected to be represented in a format similar to the [W3C Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), except that additional semi-colon delimited metadata is not supported, i.e.: `key1=value1,key2=value2`. If the user does not specify any resource, this detector will be run by default. - -Custom resource detectors related to specific environments (e.g. specific cloud vendors) must be implemented as packages separate to the core SDK, and users will need to import these separately. +In order to apply auto-detected resource information to all kinds of telemetry, +a user will need to configure which resource detector(s) they would like to run +(e.g. AWS EC2 detector). + +If multiple detectors are configured, and more than one of these successfully +detects a resource, the resources will be merged according to the Merge +interface already defined in the specification, i.e. the earliest matched +resource's attributes will take precedence. Each detector may be run in +parallel, but to ensure deterministic results, the resources must be merged in +the order the detectors were added. + +A default implementation of a detector that reads resource data from the +`OTEL_RESOURCE` environment variable will be included in the SDK. The +environment variable will consist of a list of key-value pairs, and these are +expected to be represented in a format similar to the [W3C +Correlation-Context](https://github.com/w3c/correlation-context/blob/master/correlation_context/HTTP_HEADER_FORMAT.md#header-value), +except that additional semi-colon delimited metadata is not supported, i.e.: +`key1=value1,key2=value2`. If the user does not specify any resource, this +detector will be run by default. + +Custom resource detectors related to specific environments (e.g.
specific cloud +vendors) must be implemented as packages separate to the core SDK, and users +will need to import these separately. ## Internal details -As described above, the following will be added to the Resource SDK specification: +As described above, the following will be added to the Resource SDK +specification: - An interface for "detectors", to retrieve resource information -- Specification for a global function to merge resources returned by a set of detectors -- Details of the "from environment variable" detector implementation as described above -- Specification that default detection (from environment variable) runs once on startup, and is used by all tracer & meter providers by default if no custom resource is supplied - -This is a relatively small proposal so it is easiest to explain the details with a code example: +- Specification for a global function to merge resources returned by a set of + detectors +- Details of the "from environment variable" detector implementation as + described above +- Specification that default detection (from environment variable) runs once on + startup, and is used by all tracer & meter providers by default if no custom + resource is supplied ### Usage -The following example in Go creates a tracer and meter provider that uses resource information automatically detected from AWS or GCP: +The following example in Go creates a tracer and meter provider that uses +resource information automatically detected from AWS or GCP: -Assumes a dependency has been added on the `otel/api`, `otel/sdk`, `otel/awsdetector`, and `otel/gcpdetector` packages. +Assumes a dependency has been added on the `otel/api`, `otel/sdk`, +`otel/awsdetector`, and `otel/gcpdetector` packages. 
```go resource, _ := sdkresource.Detect(ctx, 5 * time.Second, awsdetector.ec2, gcpdetector.gce) @@ -45,57 +74,105 @@ mp := push.New(..., push.WithResource(resource)) #### Detector -The Detector interface will simply return a Resource: - -```go -type Detector interface { - Detect(ctx context.Context) (*Resource, error) -} -``` +The `Detector` interface will simply contain a `Detect` function that returns a +Resource. -The detect function should contain a mechanism to timeout and cancel the request. If a detector is not able to detect a resource, it must return an uninitialized resource such that the result of each call to `Detect` can be merged. +The `Detect` function should contain a mechanism to timeout and cancel the +operation. If a detector is not able to detect a resource, it must return an +uninitialized resource such that the result of each call to `Detect` can be +merged. #### Global Function -The SDK will provide a `Detect` function. This will take a set of detectors that should be run and merged in order as described in the intro, and return a resource. - -```go -func Detect(ctx context.Context, timeout time.Duration, ...resource.Detector) (*Resource, error) -``` +The SDK will also provide a global `Detect` function. This will take a timeout +duration and a set of detectors that should be run and merged in order as +described in the intro, and return a resource. ### Error Handling -In the case of one or more detectors raising an error, there are two reasonable options: +In the case of one or more detectors raising an error, there are two reasonable +options: -1. Ignore that detector, and continue with a warning (likely meaning we will continue without expected resource information) +1. Ignore that detector, and continue with a warning (likely meaning we will + continue without expected resource information) 2. Crash the application (raise a panic) The user can decide how to recover from failure. 
## Trade-offs and mitigations -- This OTEP proposes storing Vendor resource detection packages outside of the SDK. This ensures the SDK is free of vendor specific code. Given the relatively straightforward & minimal amount of code generally needed to perform resource detection, and the relatively small number of cloud providers, we may instead decide its okay for all the resource detection code to live in the SDK directly. - - If we do allow Vendor resource detection packages in the SDK, we presumably need to restrict these to not being able to use non-trivial libraries -- This OTEP proposes only performing environment variable resource detection by default. Given the relatively small number of cloud providers, we may instead decide its okay to run all detectors by default. This raises the question of if any restrictions would need to be put on this, and how we would handle this in the future if the number of Cloud Providers rises. It would be difficult to back out of running these by default as that would lead to a breaking change. -- This OTEP proposes a global function the user calls with the detectors they want to run, and then expects the user to pass these into the providers. An alternative option (that was previously proposed in this OTEP) would be to supply a set of detectors directly to the metric or trace provider instead of, or as an additional option to, a static resource. That would result in marginally simpler setup code where the user doesn't need to call `AutoDetect` themselves. Another advantage of this approach is that its easier to specify default detectors and override these separately to any static resource the user may want to provide. On the downside, this approach adds the complexity of having to deal with the merging the detected resource with a static resource if provided. It also potentially adds a lot of complexity around how to avoid having detectors run multiple times since they will be configured for each provider. 
Avoiding having to specify detectors for tracer & meter providers is the primary reason for not going with that option in the end. -- The attribute proto now supports arrays & maps. We could support parsing this out of the `OTEL_RESOURCE` environment variable similar to how Correlation Context supports semi colon lists of keys & key-value pairs, but the added complexity is probably not worthwhile implementing unless someone has a good use case for this. -- In the case of an error at resource detection time, another alternative would be to start a background thread to retry following some strategy, but it's not clear that there would be much value in doing this, and it would add considerable unnecessary complexity. +- This OTEP proposes storing Vendor resource detection packages outside of the + SDK. This ensures the SDK is free of vendor-specific code. Given the + relatively straightforward & minimal amount of code generally needed to + perform resource detection, and the relatively small number of cloud + providers, we may instead decide it's okay for all the resource detection code + to live in the SDK directly. + - If we do allow Vendor resource detection packages in the SDK, we presumably + need to restrict these to not being able to use non-trivial libraries +- This OTEP proposes only performing environment variable resource detection by + default. Given the relatively small number of cloud providers, we may instead + decide it's okay to run all detectors by default. This raises the question of + whether any restrictions would need to be put on this, and how we would handle this + in the future if the number of Cloud Providers rises. It would be difficult to + back out of running these by default as that would lead to a breaking change. +- This OTEP proposes a global function the user calls with the detectors they + want to run, and then expects the user to pass these into the providers.
An + alternative option (that was previously proposed in this OTEP) would be to + supply a set of detectors directly to the metric or trace provider instead of, + or as an additional option to, a static resource. That would result in + marginally simpler setup code where the user doesn't need to call `AutoDetect` + themselves. Another advantage of this approach is that it's easier to specify + default detectors and override these separately from any static resource the + user may want to provide. On the downside, this approach adds the complexity + of having to deal with merging the detected resource with a static + resource if provided. It also potentially adds a lot of complexity around how + to avoid having detectors run multiple times since they will be configured for + each provider. Avoiding having to specify detectors for tracer & meter + providers is the primary reason for not going with that option in the end. +- The attribute proto now supports arrays & maps. We could support parsing this + out of the `OTEL_RESOURCE` environment variable similar to how Correlation + Context supports semi-colon lists of keys & key-value pairs, but the added + complexity is probably not worthwhile implementing unless someone has a good + use case for this. +- In the case of an error at resource detection time, another alternative would + be to start a background thread to retry following some strategy, but it's not + clear that there would be much value in doing this, and it would add + considerable unnecessary complexity. ## Prior art and alternatives -This proposal is largely inspired by the existing OpenCensus specification, the OpenCensus Go implementation, and the OpenTelemetry JS implementation.
For reference, see the relevant section of the [OpenCensus specification](https://github.com/census-instrumentation/opencensus-specs/blob/master/resource/Resource.md#populating-resources) +This proposal is largely inspired by the existing OpenCensus specification, the +OpenCensus Go implementation, and the OpenTelemetry JS implementation. For +reference, see the relevant section of the [OpenCensus +specification](https://github.com/census-instrumentation/opencensus-specs/blob/master/resource/Resource.md#populating-resources) ### Existing OpenTelemetry implementations -- Resource detection implementation in JS SDK [here](https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-resources): The JS implementation is very similar to this proposal. This proposal states that the SDK will allow detectors to be passed into telemetry providers directly instead of just having a global `DetectResources` function which the user will need to call and pass in explicitly. In addition, vendor specific resource detection code is currently in the JS resource package, so this would need to be separated. -- Environment variable resource detection in Java SDK [here](https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/resources/EnvVarResource.java): This implementation does not currently include a detector interface, but is used by default for tracer and meter providers +- Resource detection implementation in JS SDK + [here](https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-resources): + The JS implementation is very similar to this proposal. This proposal states + that the SDK will allow detectors to be passed into telemetry providers + directly instead of just having a global `DetectResources` function which the + user will need to call and pass in explicitly. 
In addition, vendor specific + resource detection code is currently in the JS resource package, so this would + need to be separated. +- Environment variable resource detection in Java SDK + [here](https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/resources/EnvVarResource.java): + This implementation does not currently include a detector interface, but is + used by default for tracer and meter providers ## Open questions -- Does this interfere with any other upcoming specification changes related to resources? -- If custom detectors need to live outside the core repo, what is the expectation regarding where they should be hosted? +- Does this interfere with any other upcoming specification changes related to + resources? +- If custom detectors need to live outside the core repo, what is the + expectation regarding where they should be hosted? - Also see the [Trade-offs and mitigations](#trade-offs-and-mitigations) section ## Future possibilities -When the Collector is run as an agent, the same interface, shared with the Go SDK, could be used to append resource information detected from the host to all kinds of telemetry in a Processor (probably as an extension to the existing Resource Processor). This would require a translation from the SDK resource to the collector's internal representation of a resource. +When the Collector is run as an agent, the same interface, shared with the Go +SDK, could be used to append resource information detected from the host to all +kinds of telemetry in a Processor (probably as an extension to the existing +Resource Processor). This would require a translation from the SDK resource to +the collector's internal representation of a resource.