Automate reference documentation as YAML or JSON #24189
Comments
As it stands, I think we are in a slow process of automating whatever parts of components we can generate using templates.
Ideally, with this YAML fragment, we can drive the config.go file and the README so that they contain a complete table of all config options. This comment is the closest we have to the latest progress. |
That sounds great, @atoulme. Would this apply to metrics as well? For settings, I was thinking of a schema like this:

    name: <name_of_entity>
    fields:
      - name: <field_name>
        value_type: <data_type>
        default: <default_value>
        description: |
          <description>

For metrics, thinking also of APM instrumentations, it'd be something like:

    name: <name_of_component_or_instrumentation>
    metrics:
      <name_of_metric>:
        status: <default|custom|arbitrary_vendor_value>
        enabled: <true|false>
        type: <sum|gauge|counter|histogram|others>
        value_type: <int|string|...>
        monotonic: <true|false>
        aggregation: <cumulative|...>
        unit: <unit_of_measurement>
        description: <metric_description>
        attributes: [list_of_attributes]
    dimensions:
      <name_of_dimension>:
        description: <description>
    properties:
    resource_attributes:
      <name_resource_attribute>:
        description: <description>
        enabled: <true|false>
        value_type: <data_type>
    attributes:
      <name_attribute>:
        description: <description>
        value_type: <data_type>
        enum: [possible_values_list]
|
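To make the metrics schema above a bit more concrete, here is a minimal Go sketch of types that could hold one such document. It is only an illustration, not mdatagen's actual model: the names `Metric`, `MetricsDoc`, and `parseMetricsDoc` are hypothetical, and the input is the JSON rendering of the document (the issue title allows YAML or JSON) so that only the standard library is needed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metric mirrors one entry under `metrics` in the schema proposed above.
// Hypothetical type, used for illustration only.
type Metric struct {
	Status      string   `json:"status"`
	Enabled     bool     `json:"enabled"`
	Type        string   `json:"type"`
	ValueType   string   `json:"value_type"`
	Monotonic   bool     `json:"monotonic"`
	Aggregation string   `json:"aggregation"`
	Unit        string   `json:"unit"`
	Description string   `json:"description"`
	Attributes  []string `json:"attributes"`
}

// MetricsDoc mirrors the top level of the proposed document:
// a component (or instrumentation) name plus its metrics keyed by name.
type MetricsDoc struct {
	Name    string            `json:"name"`
	Metrics map[string]Metric `json:"metrics"`
}

// parseMetricsDoc decodes the JSON form of a metrics reference document.
func parseMetricsDoc(data []byte) (MetricsDoc, error) {
	var doc MetricsDoc
	err := json.Unmarshal(data, &doc)
	return doc, err
}

func main() {
	// Made-up sample data; the component and metric names are placeholders.
	raw := []byte(`{
	  "name": "examplereceiver",
	  "metrics": {
	    "example.requests": {
	      "status": "default",
	      "enabled": true,
	      "type": "sum",
	      "value_type": "int",
	      "monotonic": true,
	      "aggregation": "cumulative",
	      "unit": "1",
	      "description": "Number of requests handled.",
	      "attributes": ["method"]
	    }
	  }
	}`)
	doc, err := parseMetricsDoc(raw)
	if err != nil {
		panic(err)
	}
	m := doc.Metrics["example.requests"]
	fmt.Println(doc.Name, m.Type, m.Unit, m.Attributes)
}
```

The `dimensions`, `resource_attributes`, and `attributes` sections could presumably be modeled the same way, as maps keyed by name.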
It's unclear to me how to model the
It's also unclear to me how do we model |
I recently used One limitation of the current generated metadata is that it does not capture whether certain fields are optional or required. It is also unable to handle custom validation logic. On the validation front: because ConfigValidator can contain arbitrary Go code, it's infeasible to convert it to a more declarative representation like jsonschema. Apologies if this has been brought up in the past, but I'm curious whether there was any discussion of flipping the order: authoring configuration in something like jsonschema and generating Go code based on it. JSON Schema seems to cover most of the component configurations I've seen in the wild, including the validations. If components really need arbitrary validation, that can still be kept as an escape hatch and documented in the jsonschema. |
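A rough Go sketch of the idea in the comment above: declarative rules (required fields, basic types) come from a schema, and arbitrary validation stays available as an escape hatch. This is an assumption-laden illustration, not a JSON Schema implementation and not the Collector's API: it understands only `required` and a flat `properties` map with `type`, and the names `validate`, `customCheck`, and `matchesType` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// property and schema cover only the tiny JSON Schema subset used here.
type property struct {
	Type string `json:"type"`
}

type schema struct {
	Required   []string            `json:"required"`
	Properties map[string]property `json:"properties"`
}

// customCheck is the escape hatch for rules the schema cannot express.
type customCheck func(cfg map[string]any) error

// validate applies the declarative rules first, then any custom checks.
func validate(schemaJSON []byte, cfg map[string]any, extra ...customCheck) error {
	var s schema
	if err := json.Unmarshal(schemaJSON, &s); err != nil {
		return fmt.Errorf("parse schema: %w", err)
	}
	for _, name := range s.Required {
		if _, ok := cfg[name]; !ok {
			return fmt.Errorf("missing required field %q", name)
		}
	}
	for name, value := range cfg {
		p, known := s.Properties[name]
		if !known {
			continue // fields not described by the schema are ignored in this sketch
		}
		if !matchesType(p.Type, value) {
			return fmt.Errorf("field %q: want %s, got %T", name, p.Type, value)
		}
	}
	for _, check := range extra {
		if err := check(cfg); err != nil {
			return err
		}
	}
	return nil
}

// matchesType checks a value against the few type names this sketch supports.
func matchesType(want string, v any) bool {
	switch want {
	case "string":
		_, ok := v.(string)
		return ok
	case "boolean":
		_, ok := v.(bool)
		return ok
	case "number":
		_, ok := v.(float64)
		return ok
	default:
		return true // other types are not checked here
	}
}

func main() {
	// Hypothetical schema for a made-up component.
	s := []byte(`{
	  "required": ["endpoint"],
	  "properties": {
	    "endpoint": {"type": "string"},
	    "timeout_seconds": {"type": "number"}
	  }
	}`)
	cfg := map[string]any{"timeout_seconds": 5.0}
	fmt.Println(validate(s, cfg)) // endpoint is absent, so an error is reported
}
```

In a real generator the required/optional information would be baked into the generated Go code rather than checked against a map at runtime; the sketch only shows how the declarative part and the escape hatch can coexist.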
Trying to galvanize this one a bit more. @chalin WDYT would be required to steer this initiative forward? Is there anything the docs SIG could do here? |
I've done this sort of work in past projects (jsonschema -> code). If we want to move forward with this, happy to step in |
@kevinslin That sounds fantastic. What do you suggest? |
High level flow:
In terms of arbitrary validation rules:
Ideally, we'd keep the exact same interfaces as exist today. Just generated via jsonschema instead of manually written. Recently discovered that the SDK team is starting to do similar work for generating SDK configuration from jsonschema. |
@kevinslin would be great if you can bring this up to one of the Collector SIG meetings (see here for the current times). Just add it to the agenda of a meeting you can attend and we can help you discuss the plan to ensure we are all aligned. I am not familiar enough with jsonschema to answer this but I would like to see the questions I asked on #24189 (comment) resolved before moving forward to avoid being blocked mid-way |
I think there was some effort somewhat along the line of this in #13384 |
As a data point, it appears we have roughly 20 custom unmarshal functions. |
One other note: Some components have configuration structs that live in another repository. E.g.:
|
I was thinking about this a bit more after the Collector SIG this morning. By chance I was listening to a podcast that had some talking points on markup languages at the end, and it made me wonder. Do we really care what markup language or format is used to generate the config struct and thus the documentation? Do we only have to pick one, whether it be Do we want to take a higher-level approach to this problem and instead make the source of truth configurable? The For example, component owner A wants to use JSON Schema and component owner B wants to use CUE. It's the format they and their team are most familiar with. Should we cater to that scenario? Ignoring my big comment that just asks a lot of questions and doesn't propose much other than a scope explosion to a small problem... I do like jsonschema and think it can be quite powerful. |
Created an initial proposal for authoring component configuration using declarative schema. Would love to get any feedback, either in this issue or on the doc 🙏 ☝️ |
@kevinslin I love it. Just curious: how hard would it be to extend this model to trace instrumentation and other projects? |
@theletterf See https://github.com/open-telemetry/opentelemetry-configuration for that :) |
@mx-psi Looks great, but how would Kevin's proposal intersect with the above? Would it build upon it? Would it then be the responsibility of each project maintainer to adopt those conventions and generate the files? |
AIUI these are independent:
The only point of overlap when it comes to the Collector is configuring the Collector's own telemetry. We are working on that in open-telemetry/opentelemetry-collector/issues/7532 |
The stake for tech writers and documentarians collaborating with the OTel projects is having mostly similar mechanisms for producing and consuming reference documentation. Despite the differences in usage scenarios and stack, I think it'd be great to align as much as possible in the way the output is produced and presented. Settings and metrics can be described in very similar ways, after all. Not sure if my concern is clear enough though? |
This makes sense. @codeboten is involved both in the Configuration WG and in the Collector so we can use his input to ensure that we are aligned in that sense.
Not sure I understand this second part. Is this about what way to express things in jsonschema given several options to do so? Is this about having a similar configuration schema for configuration sections configuring the same thing? |
More like the second. For example, settings: whether they are Collector components' settings or instrumentation settings, they could share the same structure with a few non-overlapping extensions:

    name: <name_of_entity>
    fields:
      - name: <field_name>
        value_type: <data_type>
        default: <default_value>
        description: |
          <description>
|
Created an initial PR to go over some additional design decisions that have come up while doing the implementation: #27003 |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping |
This issue has been closed as inactive because it has been stale for 120 days with no activity. |
Reference documentation for each component is hard to come by, update, and produce. Settings and metrics are, by far, the user-facing elements that change most often between releases. This imposes significant overhead on anyone trying to keep documentation up to date, both upstream and downstream.
As suggested, hinted, or tried in #23054, open-telemetry/opentelemetry-collector#7679, #20509, #15233, or #22962, I think we could generate reference documentation for each component as YAML files, using mdatagen (?) and pulling description and other information from the code.
The YAML reference could then be used to generate Markdown files or even the README file itself. It could also be used downstream by distributions, thus faithfully passing on documentation on each component. Elements that could be automated include:
An example is what Splunk has been doing here: https://github.com/splunk/collector-config-tools/tree/main/cfg-metadata