Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

Conversation

evan-bradley
Copy link
Contributor

Description

Documents how we want to configure confmap Providers, with the default decision being to use URI fragments.

Per the SIG discussion on 2024-05-08, I have included as many alternatives as I could think of. I think a few of them have merit, but I don't think any of the options are without downsides.

@evan-bradley evan-bradley requested review from a team, codeboten, mx-psi and dmitryax May 8, 2024 19:51
Copy link

codecov bot commented May 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.53%. Comparing base (c6b70a7) to head (d3485a9).
Report is 266 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10121      +/-   ##
==========================================
+ Coverage   91.67%   92.53%   +0.85%     
==========================================
  Files         362      388      +26     
  Lines       16754    18317    +1563     
==========================================
+ Hits        15359    16949    +1590     
+ Misses       1056     1022      -34     
- Partials      339      346       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

docs/rfcs/confmap-provider-configuration.md Show resolved Hide resolved
docs/rfcs/confmap-provider-configuration.md Outdated Show resolved Hide resolved
docs/rfcs/confmap-provider-configuration.md Outdated Show resolved Hide resolved
docs/rfcs/confmap-provider-configuration.md Outdated Show resolved Hide resolved
Comment on lines +111 to +117
We could likely partially circumvent the key-value pair limitation by
recursively calling confmap Providers to resolve files, env vars, HTTP URLs,
etc. For example:

```text
https://config.com/config#refresh-interval=env:REFRESH_INTERVAL&headers=file:headers.yaml
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do this since #7504 with the following syntax:

https://config.com/config#refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}

At least, within a configuration file, I am not sure about CLI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the braces syntax only applies to URIs in files and doesn't work in the CLI. The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $. On the other hand, we need some obvious way to indicate that these are URIs for extra config and not values in themselves. I'm not sure the way I have it written right now will suffice for that.

We also need to consider the way the substitution works for these options. If we have the confmap Resolver do resolution for embedded config URIs, headers.yaml would need to be marshaled into a single line, such that this:

https://config.com/config#refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}

would resolve to this:

https://config.com/config#refresh-interval=10s&headers=['Api-Token':'0xdeadbeef' 'Accept-Encoding':'text/yaml']

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file. Or we just require users to write the stringified version, which simplifies things for us at the cost of worse UX since in my opinion this representation would be unpleasant to work with.

My assumption was that we implement custom unmarshaling logic for the URI fragment options that takes all of this into account. So ${file:headers.yaml} is unmarshaled directly into a map that is assigned to a Headers field on a Provider's config struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $.

I am a bit hesitant about dropping $ both for consistency and because #refresh-interval={env:REFRESH_INTERVAL} is (I think?) a valid part of an URI, so I wouldn't want to prevent users from writing that.

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file.

My expectation is that this would work the same way it does if you use an ${...} syntax in the middle of a string in the actual config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit hesitant about dropping $ both for consistency and because #refresh-interval={env:REFRESH_INTERVAL} is (I think?) a valid part of an URI, so I wouldn't want to prevent users from writing that.

It would be a valid part of the URI, but the assumption here is that we reserve the fragment portion of URIs for our own use in upstream confmap components. Custom components would be free to use this portion of the URI as they wish, including forwarding it to a backend. I'm fine either way with whether recursive resolution requires $.

My expectation is that this would work the same way it does if you use an ${...} syntax in the middle of a string in the actual config.

I'm alright with that, that's probably the least surprising behavior here.

currently only takes valid URIs, and updating the format to accommodate this
would require we adopt an unconventional format.

### Separate flags to configure Providers per config URI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely opposed to allow configuring things via flags, but if we ever do this I would do that with a flag provider that you can call like ${flag:my-flag-name} and it would give you the (YAML-parsed?) value of --my-flag-name=val (or possibly --some-namespace-here-my-flag-name).

This would have to piggyback onto some other solution that allows passing this ${flag:my-flag-name} value though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?

I was thinking about requiring the flags to be under some namespace (e.g. --flag-provider-my-flag-name). I am not sure what the behavior should be for unused flags under that namespace (ideally an error at config resolution time if they go unused).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining.

The way I would see this working right now is to make config flags order-sensitive, like so:

--config=http://myurl.com
--flag-provider-refresh-interval=10s
--flag-provider-auth-header=${env:MYPASS}
--config=/path/to/file.yaml
[...]

We would then need to give the Provider's Retrieve method some kind of handle to obtain these flag values from. Do you think this would be viable, or do you have another solution you think may be better here?

docs/rfcs/confmap-provider-configuration.md Show resolved Hide resolved
Copy link
Member

@TylerHelmuth TylerHelmuth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the RFC include any prior art on this type of configuration? My guess is that it'll be cli flags that apply to all Providers.

docs/rfcs/confmap-provider-configuration.md Show resolved Hide resolved
@evan-bradley
Copy link
Contributor Author

Can the RFC include any prior art on this type of configuration?

I spent some time thinking about this, but I'm not aware of anything that matches what we're doing closely enough to make sense. The best I can think of would be infrastructure as code tools like Terraform, Pulumi, or CloudFormation, but I think what they are doing is sufficiently different that it's hard to use them as a basis. If there are other tools or approaches you think would be worth investigating, let me know and I can take a look.

I can also do a writeup on how our configuration fits into the larger landscape of infrastructure configuration approaches if you think it would be beneficial to take a broader look at our whole approach, but didn't want to detract from the more practical discussion at hand of how to do things like authenticate requests in our HTTP providers.

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label May 29, 2024
1. Broadly applicable to all current upstream providers.
2. Can be used as a suggestion for configuring custom providers.
3. Is consistent between Providers.
4. Is configurable per config URI.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a requirement should be that any valid URI today is expressible after this change (possibly with some escaping mechanism)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. It looks like something like ## wouldn't be a valid normal URI fragment according to RFC 3986, so that may be a good default for at least the URI option. I think the other options won't need to mess with the URI.

Comment on lines +111 to +117
We could likely partially circumvent the key-value pair limitation by
recursively calling confmap Providers to resolve files, env vars, HTTP URLs,
etc. For example:

```text
https://config.com/config#refresh-interval=env:REFRESH_INTERVAL&headers=file:headers.yaml
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $.

I am a bit hesitant about dropping $ both for consistency and because #refresh-interval={env:REFRESH_INTERVAL} is (I think?) a valid part of an URI, so I wouldn't want to prevent users from writing that.

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file.

My expectation is that this would work the same way it does if you use an ${...} syntax in the middle of a string in the actual config.

currently only takes valid URIs, and updating the format to accommodate this
would require we adopt an unconventional format.

### Separate flags to configure Providers per config URI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?

I was thinking about requiring the flags to be under some namespace (e.g. --flag-provider-my-flag-name). I am not sure what the behavior should be for unused flags under that namespace (ideally an error at config resolution time if they go unused).

Comment on lines +144 to +148
Instead of configuring Providers inside their URIs, we could add an extra
section to the string passed to `--config` and configure how a URI is retrieved
through that config. This approach is undesirable because the config flag
currently only takes valid URIs, and updating the format to accommodate this
would require we adopt an unconventional format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I read, the more I would prefer something like this: no conflict, it's clear what it means, you can configure per URI... This seems easy enough in the config itself (e.g. we could have ${http://example.com}[refresh-interval=10s] or something like this), but I don't see now how to put that into practice for the CLI right now 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would see it work like this for each format:

File:

val: ${http://example.com}[refresh-interval=10s]

CLI:

--config=http://example.com
--resolution-options=refresh-interval=10s&auth-header=Api-Token

If we don't do a config flag, we'll have to find some character that's not supported here and use it as a delimiter between the URI and options. Looking at the gen-delims and sub-delims character classes, |, \, {, }, ^, and ` could be options.

Copy link
Contributor

github-actions bot commented Jul 5, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 5, 2024
@mx-psi mx-psi added help wanted Good issue for contributors to OpenTelemetry Service to pick up needs-review-from-stability-wg labels Jul 17, 2024
@github-actions github-actions bot removed the Stale label Jul 18, 2024
mx-psi added a commit that referenced this pull request Jul 25, 2024
)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
<!-- Issue number if applicable -->

Validate URLs before fetching. 

#### Link to tracking issue
Fixes #10468, Relates to #10121
Copy link
Member

@mx-psi mx-psi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to design this from scratch, something like ${<provider>:URI}[params] would be my desired syntax, but I am worried that this would be too much of a breakage to do today (the recent confmap changes show that people do actively use a lot of edge cases).

I think the next best solution is URI fragments. My main hesitation there is that, while this works for HTTP(s) providers (it's a client-side only feature), this may not work for other URIs (e.g. it's a valid character on file names, other systems may use fragments for something meaningful). However, I feel like this breakage is much smaller for the currently existing providers (# is not a common character in filenames). I would like to propose the following plan:

  1. We rework this RFC to propose fragments as the solution, with no mention about what the specific syntax is (e.g. we table the adding $ or not adding it). We add an explicit escaping mechanism for #.
  2. For all providers that currently allow # in the URIs (possibly others in contrib), we implement # escaping and start logging a warning if an unescaped # is present.
  3. After a few releases we make having a # in a URI a hard error.

If we agree on this plan and the PR is merged with these changes, this unblocks confmap 1.0 (we would have to wait to make the specific providers 1.0 until the cycle above is finished).

Whenever we want to add the first parameter we will have to restart some of the discussions here, but this can be done after 1.0 is completed.

@TylerHelmuth
Copy link
Member

Is there an option to support a ${<provider>:URI}[params] style if we introduce a new specialized flag? Like instead of --config its --fancy-config? That way we get the best syntax without breaking users?

@evan-bradley
Copy link
Contributor Author

I think at that point we should just have an alpha-stability feature gate that we use to determine whether params are supported. I see the benefit there, but I think it would also be confusing to have to opt-in per-file. In cases where I forget to use --fancy-config, I would get weird syntax errors (or worse, no error at all) without a clear indicator of why.

I'm also a bit nervous about committing long-term to maintaining two ways to parse the config.

@evan-bradley
Copy link
Contributor Author

This is replaced by #10776.

dmitryax pushed a commit that referenced this pull request Aug 7, 2024
…0776)

#### Description

This is a stripped-down version of
#10121
that eschews implementation details in favor of answering questions
about _what_ we want to achieve instead of how we will achieve it. A
section from that PR has been reproduced here to cover a few approaches
for future discussions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Good issue for contributors to OpenTelemetry Service to pick up
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants