[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

evan-bradley · 2024-05-08T19:51:06Z

Description

Documents how we want to configure confmap Providers, with the default decision being to use URI fragments.

Per the SIG discussion on 2024-05-08, I have included as many alternatives as I could think of. I think a few of them have merit, but I don't think any of the options are without downsides.

…roviders

codecov · 2024-05-08T20:00:08Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.53%. Comparing base (c6b70a7) to head (d3485a9).
Report is 266 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #10121      +/-   ##
==========================================
+ Coverage   91.67%   92.53%   +0.85%     
==========================================
  Files         362      388      +26     
  Lines       16754    18317    +1563     
==========================================
+ Hits        15359    16949    +1590     
+ Misses       1056     1022      -34     
- Partials      339      346       +7

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

docs/rfcs/confmap-provider-configuration.md

mx-psi · 2024-05-09T11:35:44Z

docs/rfcs/confmap-provider-configuration.md

+We could likely partially circumvent the key-value pair limitation by
+recursively calling confmap Providers to resolve files, env vars, HTTP URLs,
+etc. For example:
+
+```text
+https://config.com/config#refresh-interval=env:REFRESH_INTERVAL&headers=file:headers.yaml
+```


You can do this since #7504 with the following syntax:

https://config.com/config#refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}

At least, within a configuration file, I am not sure about CLI

I think the braces syntax only applies to URIs in files and doesn't work in the CLI. The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $. On the other hand, we need some obvious way to indicate that these are URIs for extra config and not values in themselves. I'm not sure the way I have it written right now will suffice for that.

We also need to consider the way the substitution works for these options. If we have the confmap Resolver do resolution for embedded config URIs, headers.yaml would need to be marshaled into a single line, such that this:

https://config.com/config#refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}

would resolve to this:

https://config.com/config#refresh-interval=10s&headers=['Api-Token':'0xdeadbeef' 'Accept-Encoding':'text/yaml']

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file. Or we just require users to write the stringified version, which simplifies things for us at the cost of worse UX since in my opinion this representation would be unpleasant to work with.

My assumption was that we implement custom unmarshaling logic for the URI fragment options that takes all of this into account. So ${file:headers.yaml} is unmarshaled directly into a map that is assigned to a Headers field on a Provider's config struct.

The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $.

I am a bit hesitant about dropping $ both for consistency and because #refresh-interval={env:REFRESH_INTERVAL} is (I think?) a valid part of an URI, so I wouldn't want to prevent users from writing that.

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file.

My expectation is that this would work the same way it does if you use an ${...} syntax in the middle of a string in the actual config.

I am a bit hesitant about dropping $ both for consistency and because #refresh-interval={env:REFRESH_INTERVAL} is (I think?) a valid part of an URI, so I wouldn't want to prevent users from writing that.

It would be a valid part of the URI, but the assumption here is that we reserve the fragment portion of URIs for our own use in upstream confmap components. Custom components would be free to use this portion of the URI as they wish, including forwarding it to a backend. I'm fine either way with whether recursive resolution requires $.

My expectation is that this would work the same way it does if you use an ${...} syntax in the middle of a string in the actual config.

I'm alright with that, that's probably the least surprising behavior here.

mx-psi · 2024-05-09T11:39:19Z

docs/rfcs/confmap-provider-configuration.md

+currently only takes valid URIs, and updating the format to accommodate this
+would require we adopt an unconventional format.
+
+### Separate flags to configure Providers per config URI


I am not entirely opposed to allow configuring things via flags, but if we ever do this I would do that with a flag provider that you can call like ${flag:my-flag-name} and it would give you the (YAML-parsed?) value of --my-flag-name=val (or possibly --some-namespace-here-my-flag-name).

This would have to piggyback onto some other solution that allows passing this ${flag:my-flag-name} value though

If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?

If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?

I was thinking about requiring the flags to be under some namespace (e.g. --flag-provider-my-flag-name). I am not sure what the behavior should be for unused flags under that namespace (ideally an error at config resolution time if they go unused).

Thanks for explaining.

The way I would see this working right now is to make config flags order-sensitive, like so:

--config=http://myurl.com --flag-provider-refresh-interval=10s --flag-provider-auth-header=${env:MYPASS} --config=/path/to/file.yaml [...]

We would then need to give the Provider's Retrieve method some kind of handle to obtain these flag values from. Do you think this would be viable, or do you have another solution you think may be better here?

docs/rfcs/confmap-provider-configuration.md

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>

TylerHelmuth

Can the RFC include any prior art on this type of configuration? My guess is that it'll be cli flags that apply to all Providers.

docs/rfcs/confmap-provider-configuration.md

evan-bradley · 2024-05-14T13:30:18Z

Can the RFC include any prior art on this type of configuration?

I spent some time thinking about this, but I'm not aware of anything that matches what we're doing closely enough to make sense. The best I can think of would be infrastructure as code tools like Terraform, Pulumi, or CloudFormation, but I think what they are doing is sufficiently different that it's hard to use them as a basis. If there are other tools or approaches you think would be worth investigating, let me know and I can take a look.

I can also do a writeup on how our configuration fits into the larger landscape of infrastructure configuration approaches if you think it would be beneficial to take a broader look at our whole approach, but didn't want to detract from the more practical discussion at hand of how to do things like authenticate requests in our HTTP providers.

github-actions · 2024-05-29T03:15:58Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

…iders

mx-psi · 2024-06-05T11:04:15Z

docs/rfcs/confmap-provider-configuration.md

+1. Broadly applicable to all current upstream providers.
+2. Can be used as a suggestion for configuring custom providers.
+3. Is consistent between Providers.
+4. Is configurable per config URI.


I feel like a requirement should be that any valid URI today is expressible after this change (possibly with some escaping mechanism)

Agreed. It looks like something like ## wouldn't be a valid normal URI fragment according to RFC 3986, so that may be a good default for at least the URI option. I think the other options won't need to mess with the URI.

mx-psi · 2024-06-05T11:08:58Z

docs/rfcs/confmap-provider-configuration.md

+We could likely partially circumvent the key-value pair limitation by
+recursively calling confmap Providers to resolve files, env vars, HTTP URLs,
+etc. For example:
+
+```text
+https://config.com/config#refresh-interval=env:REFRESH_INTERVAL&headers=file:headers.yaml
+```


The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $.

I am a bit hesitant about dropping $ both for consistency and because #refresh-interval={env:REFRESH_INTERVAL} is (I think?) a valid part of an URI, so I wouldn't want to prevent users from writing that.

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file.

My expectation is that this would work the same way it does if you use an ${...} syntax in the middle of a string in the actual config.

mx-psi · 2024-06-05T11:13:11Z

docs/rfcs/confmap-provider-configuration.md

+currently only takes valid URIs, and updating the format to accommodate this
+would require we adopt an unconventional format.
+
+### Separate flags to configure Providers per config URI


If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?

I was thinking about requiring the flags to be under some namespace (e.g. --flag-provider-my-flag-name). I am not sure what the behavior should be for unused flags under that namespace (ideally an error at config resolution time if they go unused).

mx-psi · 2024-06-05T11:19:17Z

docs/rfcs/confmap-provider-configuration.md

+Instead of configuring Providers inside their URIs, we could add an extra
+section to the string passed to `--config` and configure how a URI is retrieved
+through that config. This approach is undesirable because the config flag
+currently only takes valid URIs, and updating the format to accommodate this
+would require we adopt an unconventional format.


The more I read, the more I would prefer something like this: no conflict, it's clear what it means, you can configure per URI... This seems easy enough in the config itself (e.g. we could have ${http://example.com}[refresh-interval=10s] or something like this), but I don't see now how to put that into practice for the CLI right now 🤔

I would see it work like this for each format:

File:

val: ${http://example.com}[refresh-interval=10s]

CLI:

--config=http://example.com --resolution-options=refresh-interval=10s&auth-header=Api-Token

If we don't do a config flag, we'll have to find some character that's not supported here and use it as a delimiter between the URI and options. Looking at the gen-delims and sub-delims character classes, |, \, {, }, ^, and ` could be options.

github-actions · 2024-07-05T03:15:59Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

)  #### Description  Validate URLs before fetching. #### Link to tracking issue Fixes #10468, Relates to #10121

mx-psi

If we were to design this from scratch, something like ${<provider>:URI}[params] would be my desired syntax, but I am worried that this would be too much of a breakage to do today (the recent confmap changes show that people do actively use a lot of edge cases).

I think the next best solution is URI fragments. My main hesitation there is that, while this works for HTTP(s) providers (it's a client-side only feature), this may not work for other URIs (e.g. it's a valid character on file names, other systems may use fragments for something meaningful). However, I feel like this breakage is much smaller for the currently existing providers (# is not a common character in filenames). I would like to propose the following plan:

We rework this RFC to propose fragments as the solution, with no mention about what the specific syntax is (e.g. we table the adding $ or not adding it). We add an explicit escaping mechanism for #.
For all providers that currently allow # in the URIs (possibly others in contrib), we implement # escaping and start logging a warning if an unescaped # is present.
After a few releases we make having a # in a URI a hard error.

If we agree on this plan and the PR is merged with these changes, this unblocks confmap 1.0 (we would have to wait to make the specific providers 1.0 until the cycle above is finished).

Whenever we want to add the first parameter we will have to restart some of the discussions here, but this can be done after 1.0 is completed.

TylerHelmuth · 2024-07-26T15:37:16Z

Is there an option to support a ${<provider>:URI}[params] style if we introduce a new specialized flag? Like instead of --config its --fancy-config? That way we get the best syntax without breaking users?

evan-bradley · 2024-07-26T16:01:30Z

I think at that point we should just have an alpha-stability feature gate that we use to determine whether params are supported. I see the benefit there, but I think it would also be confusing to have to opt-in per-file. In cases where I forget to use --fancy-config, I would get weird syntax errors (or worse, no error at all) without a clear indicator of why.

I'm also a bit nervous about committing long-term to maintaining two ways to parse the config.

evan-bradley · 2024-08-06T15:23:41Z

This is replaced by #10776.

…0776) #### Description This is a stripped-down version of #10121 that eschews implementation details in favor of answering questions about _what_ we want to achieve instead of how we will achieve it. A section from that PR has been reproduced here to cover a few approaches for future discussions.

[chore][docs/rfc] Add RFC to cement decision on configuring confmap P…

c805b0c

…roviders

evan-bradley requested review from a team, codeboten, mx-psi and dmitryax May 8, 2024 19:51

Clarify HTTP use case

3ad18d0

mx-psi reviewed May 9, 2024

View reviewed changes

evan-bradley and others added 4 commits May 9, 2024 08:31

Update docs/rfcs/confmap-provider-configuration.md

d7ce426

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>

Address PR feedback

62344ca

Add note about in-file config

a2ba849

Callout for env provider

a1dee8f

TylerHelmuth reviewed May 13, 2024

View reviewed changes

docs/rfcs/confmap-provider-configuration.md Show resolved Hide resolved

github-actions bot added the Stale label May 29, 2024

Describe breaking changes and add a section on configuring whole Prov…

f566140

…iders

evan-bradley removed the Stale label May 29, 2024

mx-psi reviewed Jun 5, 2024

View reviewed changes

Require all URIs to continue to be expressible

d3485a9

mx-psi mentioned this pull request Jun 26, 2024

Validate URIs on http and https providers #10468

Closed

github-actions bot added the Stale label Jul 5, 2024

mx-psi added help wanted Good issue for contributors to OpenTelemetry Service to pick up needs-review-from-stability-wg labels Jul 17, 2024

github-actions bot removed the Stale label Jul 18, 2024

mx-psi added this to the go.opentelemetry.io/collector/confmap 1.0 milestone Jul 23, 2024

mx-psi mentioned this pull request Jul 24, 2024

[confmap/provider/http(s)provider] Validate URIs before fetching #10721

Merged

mx-psi reviewed Jul 25, 2024

View reviewed changes

evan-bradley mentioned this pull request Jul 31, 2024

[chore][docs/rfc] Add RFC on confmap Provider configuration goals #10776

Merged

evan-bradley closed this Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

evan-bradley commented May 8, 2024

codecov bot commented May 8, 2024 •

edited

Loading

mx-psi May 9, 2024

evan-bradley May 9, 2024

mx-psi Jun 5, 2024

evan-bradley Jun 11, 2024

mx-psi May 9, 2024

evan-bradley May 9, 2024

mx-psi Jun 5, 2024

evan-bradley Jun 11, 2024

TylerHelmuth left a comment

evan-bradley commented May 14, 2024

github-actions bot commented May 29, 2024

mx-psi Jun 5, 2024

evan-bradley Jun 11, 2024

mx-psi Jun 5, 2024

mx-psi Jun 5, 2024

mx-psi Jun 5, 2024

evan-bradley Jun 11, 2024

github-actions bot commented Jul 5, 2024

mx-psi left a comment

TylerHelmuth commented Jul 26, 2024

evan-bradley commented Jul 26, 2024

evan-bradley commented Aug 6, 2024

[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

Conversation

evan-bradley commented May 8, 2024

Description

codecov bot commented May 8, 2024 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TylerHelmuth left a comment

Choose a reason for hiding this comment

evan-bradley commented May 14, 2024

github-actions bot commented May 29, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Jul 5, 2024

mx-psi left a comment

Choose a reason for hiding this comment

TylerHelmuth commented Jul 26, 2024

evan-bradley commented Jul 26, 2024

evan-bradley commented Aug 6, 2024

codecov bot commented May 8, 2024 •

edited

Loading