Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/transform] Set attribute values using connection context #33288

Open
ptodev opened this issue May 29, 2024 · 19 comments
Open

[processor/transform] Set attribute values using connection context #33288

ptodev opened this issue May 29, 2024 · 19 comments
Assignees
Labels
enhancement New feature or request pkg/ottl processor/transform Transform processor

Comments

@ptodev
Copy link
Contributor

ptodev commented May 29, 2024

Component(s)

processor/transform

Is your feature request related to a problem? Please describe.

The attribute and resource processors have a from_context config argument which can be used to set attribute values using connection context:

  # Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # FromContext specifies the context value to use to populate the attribute value. 
  # If the key is prefixed with `metadata.`, the values are searched
  # in the receiver's transport protocol additional information like gRPC Metadata or HTTP Headers. 
  # If the key is prefixed with `auth.`, the values are searched
  # in the authentication information set by the server authenticator. 
  # Refer to the server authenticator's documentation part of your pipeline for more information about which attributes are available.
  # If the key doesn't exist, no action is performed.
  # If the key has multiple values the values will be joined with `;` separator.
  from_context: <other key>

It would be nice to be able to do the same thing with the transform processor.

Describe the solution you'd like

The solution should work with all OTTL contexts, so probably we need to add a new OTTL function?

Describe alternatives you've considered

No response

Additional context

This feature will help reduce the need for attributes and resource processors.

Also, #18643 suggests that the transform processor may replace the attributes and resource processors. Having this feature would be a prerequisite for that.

@ptodev ptodev added enhancement New feature or request needs triage New item requiring triage labels May 29, 2024
@github-actions github-actions bot added the processor/transform Transform processor label May 29, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth TylerHelmuth added pkg/ottl and removed needs triage New item requiring triage labels May 29, 2024
Copy link
Contributor

Pinging code owners for pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley. See Adding Labels via Comments if you do not have permissions to add labels yourself.

Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jul 29, 2024
@odubajDT
Copy link
Contributor

odubajDT commented Aug 1, 2024

Hey @TylerHelmuth I would like to look at this issue

@TylerHelmuth
Copy link
Member

@odubajDT @evan-bradley I'd like to talk about this more. We don't have syntax or a specific transformcontext to represent how to extract something from the connection context. I'd like to see some proposals on how we'd implement this before making a PR.

@evan-bradley
Copy link
Contributor

evan-bradley commented Aug 7, 2024

Could we have that discussion on a draft PR? I would prefer to look at something concrete when considering possible designs. I think the high-level design is fairly straightforward (get a value from the context through a path/converter), and the implementation is what's under consideration.

@evan-bradley
Copy link
Contributor

As an initial proposal, we'll want a ctx/context path that is required to be keyed by the path parser and calls (context.Context).Value with the key, and returns the return value (an any-typed value or nil).

As for the transform context, the most conventional way to do this would be to add context.Context as the first parameter to each NewTransformContext function, but this is something we can explore in a concrete implementation. Once the transform context has the context.Context reference, it's just a matter of retrieving it through a path.

If the implementation shows flaws, we can probably pull this off using a converter that relies on an interface implemented by the transform context, but I'd rather save this as a plan B.

@TylerHelmuth
Copy link
Member

We keep breaking NewTransformContext functions as we need access to more data. Should we add an Option pattern?

@evan-bradley
Copy link
Contributor

I think for now it's conceptually simpler for all data to be required. I think we will always have a context.Context object that can be used and can avoid the situation where certain context paths don't work in certain components.

In regards to breaking NewTransformContext, as non-ergonomic as it is from a function signature perspective, if we want everything to be required I'd rather break this at compilation time. The alternative is we implement a config struct or options pattern that we have to validate at runtime for containing the correct data, which would place the burden on users instead of component authors if something is required but not present.

@TylerHelmuth
Copy link
Member

Ok lets try with just breaking it.

Are there any security concerns here? It seems likely you could do something like merge_maps(attributes, ctx, "upsert"), and expose keys.

@evan-bradley
Copy link
Contributor

I dug into it and there's no way to iterate over context.Context without reflection. The path parser will need to require that you key the ctx path when using it. I think that should protect us from situations like the one you described, since if you ask for a key, it's your responsibility to know what's in it.

The one tricky thing is that any complex key/value types will be basically impossible to do without custom OTTL extensions. I'm not sure how we properly address this situation:

ctx := context.WithValue(ctx, StructKey{key: "key"}, CustomStructData{data: contextData})

Keying this would be impossible with OTTL, and even if you use e.g. a string key here instead, using the data is impossible unless you can e.g. convert the type to a map. I think this will just have to be a known caveat.

@TylerHelmuth
Copy link
Member

Requiring an index should be enforceable by the ottl context path function. I agree we should only support string keys.

@evan-bradley
Copy link
Contributor

I agree we should only support string keys.

We don't have to enforce this, (context.Context).Value takes an any type as a key and we can just pass a user-supplied value directly and return whatever Value returns (nil or the value). I was just pointing out that non-primitive typed keys won't work out-of-the-box.

Thinking on it, users could add Converters to provide support for complex key types, e.g. ctx[KeyFunc("key")] where KeyFunc returns whatever type they use to key context values. So I think we should be good to go here.

@TylerHelmuth
Copy link
Member

In that case, I think we should be extremely strict about NOT adding any KeyFunc converters to pkg/ottlfuncs.

@odubajDT
Copy link
Contributor

odubajDT commented Sep 4, 2024

One additional question here, if I understand the implementation of attributes processor correctly, there is the possibility to extract context data only from the following context keys:

- "metadata."
- "auth."
- "client.address"

What in this case means connection context? Do we want to duplicate the exact same functionality or something else as well? Thanks in advance!

@evan-bradley
Copy link
Contributor

What in this case means connection context?

The Collector passes a context.Context object down the pipeline alongside a payload, and usually receivers will include information about the network connection inside of this object.

Do we want to duplicate the exact same functionality or something else as well?

I think the fact that we have the ability to spread operations over multiple statements means that we don't have to special-case this kind of addressing. For example, we could have the following: ctx["auth"]["api-token"] instead of needing to write auth.api-token like it would be in the attributes processor.

Copy link
Contributor

github-actions bot commented Nov 4, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth
Copy link
Member

TylerHelmuth commented Nov 15, 2024

@djaglowski and I talked a bit about this at kubecon. We think we can get away with including the context.Context in the TransformContext and then adding specific paths to each ottl context to support accessing specific things in the context.Context. Something like request.http to get the map off incoming request headers.

@djaglowski
Copy link
Member

djaglowski commented Dec 2, 2024

One additional detail which @TylerHelmuth and I discussed about how OTTL paths could refer to contexts:

As mentioned elsewhere in the thread, context values can be added in several different ways and each way requires a different access pattern in code. I recently added support in the routing connector to route based on either grpc or http request context, so there is an example here where we fetch values from the context using two different mechanisms. At the same time, it shows how it may be useful to automatically check all known mechanisms for the value, since data may be received in various ways but then sent to the same component for processing.

Based on these use cases, the idea we came up with that each known mechanism for accessing values in the context would have its own Getter implementation, but that a top-level mechanism would also be available, which would search through all known mechanisms in some pre-defined order. e.g.

request.http["foo"] # gets value of "foo" from go.opentelemetry.io/collector/client.Metadata
request.grpc["foo"] # gets value of "foo" from google.golang.org/grpc/metadata
request["foo"] # gets value of "foo" from google.golang.org/grpc/metadata, but if not found there, then tries go.opentelemetry.io/collector/client.Metadata

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request pkg/ottl processor/transform Transform processor
Projects
None yet
Development

No branches or pull requests

5 participants