Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x-pack/plugin/otel: introduce x-pack-otel plugin #111091

Merged
merged 53 commits into from
Aug 19, 2024

Conversation

gregkalapos
Copy link
Contributor

@gregkalapos gregkalapos commented Jul 19, 2024

Solves #104731

This PR adds mappings for OpenTelemetry data via a new plugin otel-data

The PR is mainly based on:

What happens in the PR:

  • Created YamlTemplateRegistry as an abstract base class that loads mappings from YAML. The code of this class comes from APMIndexTemplateRegistry which now derives from YamlTemplateRegistry. The shared code is now moved to core.
  • Added otel-data plugin with OtelIndexTemplateRegistry which also derives from YamlTemplateRegistry.
  • Moved the mappings into YAML files from Add OpenTelemetry mappings #104455.

TODOs:

  • Discuss if we want to share some mappings between APM and OTel (candidates: all traces@ files from Add OpenTelemetry mappings #104455), those are currently not yet in the PR -> Discussed: not at this point Update: we did that - now some some basic things are shared across otel and apm.
  • Discuss data stream lifecycle for OTel mappings (right now, nothing is set in the PR) -> Discussed: Since these mappings are shared across multiple OTel use-cases (e.g. system metrics and APM, for now we don't set a default ILM/DSL, since it's hard to come up with a default. Using what APM does today may not be good for integrations - so for now no DSL is set in these templates).
  • Adding more tests to otel-data

@elasticsearchmachine elasticsearchmachine added v8.16.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jul 19, 2024
@felixbarny felixbarny linked an issue Jul 24, 2024 that may be closed by this pull request
description: default logs template installed by x-pack
managed: true
composed_of:
- logs@mappings

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure we want to include the component templates from the non-otel templates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would allow us to share very basic mappings and not duplicate those.

In this specific one, the only thing that we pull in is to set data_stream.type to "value": "logs" - the remaining things from logs@mappings are already covered in otel@mappings. I'd say setting data_stream.type to "value": "logs" is something which makes sense to share across log related templates.

I don't have much history on how often these mappings from core are changed - or if there would be any risk here, but I think this would be a good way to share these very basic mappings and also use it for OTel.

We discussed that potentially we could also share a few things for traces between otel and apm-data - that'd be also done via a mapping in core. But that's outside this PR.

Copy link

@strawgate strawgate Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also are explicitly setting data_stream.type in the logs-otel@mappings though? and in this template?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an outcome from this discussion: #111091 (comment).

@felixbarny
Copy link
Member

@elasticmachine update branch

@elasticmachine
Copy link
Collaborator

expected head sha didn’t match current head ref.

@gregkalapos
Copy link
Contributor Author

why are these being managed in yaml instead of matching the existing precedent of having them in json? I find yaml much more error-prone for unintentional changes, since it is heavily indentation reliant.

@dakrone that's my doing. We have historically had far more issues with business logic (wrong mappings, dynamic templates, etc.) than with syntax (indentation, "Norway problem", etc.). So we're optimising for readability and the ability to comment. Comprehensive (YAML) REST tests will cover basic syntax issues.

Yeah, so I took yaml, because apm-data already used yaml and we also discussed with @felixbarny and we said we'd move the json from #104455 to yaml. I'm personally fine with both, but my 2 cents is that apm and otel should use the same, because it'll be maintained by the same team. So with that I'd have a slight preference to move on with yaml for consistency.

There are already tests here that load these mappings and checks for them, so I'm less worried about accidental wrong indentation.

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @gregkalapos 🙏 Just one minor comment.

@gregkalapos I think you raise a good question in https://github.com/elastic/elasticsearch/pull/111091/files/a6baf4644924ea4929ae2ab58ebac0b78f936e62#r1714175422 - it's also unclear to me if a user should be able to override the all_strings_to_keywords dynamic template. I can't think of any reasons.

If it shouldn't be possible to override, then I think it would be a good candidate for going into the index template. Same goes for top-level fields, but we can iterate on that in a followup.

felixbarny and others added 4 commits August 13, 2024 11:47
Add test to verify that @Custom template can add dynamic templates with a higher precedence
Remove summary_gauge and summary_counter since they are covered by summary_metrics
@axw
Copy link
Member

axw commented Aug 14, 2024

@gregkalapos I see you removed summary_gauge & summary_counter, but summary_metrics doesn't have time_series_metric. Should we add that? I think with "counter"?

@felixbarny
Copy link
Member

@gregkalapos I see you removed summary_gauge & summary_counter, but summary_metrics doesn't have time_series_metric. Should we add that? I think with "counter"?

I'm not sure if it makes much of a difference. I suppose it's relevant for downsampling. I'm not sure if counter or gauge is most appropriate here. The reason we removed summary_gauge & summary_counter is that there's no temporality for summary metrics. It's also not always monotonically increasing as it depends on the start and end timestamp. So you can effectively have either temporality without really having an indication of what the temporality is. At least that's my understanding of it @lahsivjar, please correct me if I'm wrong.

For now, I'm in inclined to just leave out time_series_metric and then potentially add one in a follow-up.

@axw
Copy link
Member

axw commented Aug 14, 2024

I'm not sure if it makes much of a difference. I suppose it's relevant for downsampling. I'm not sure if counter or gauge is most appropriate here. The reason we removed summary_gauge & summary_counter is that there's no temporality for summary metrics. It's also not always monotonically increasing as it depends on the start and end timestamp. So you can effectively have either temporality without really having an indication of what the temporality is. At least that's my understanding of it @lahsivjar, please correct me if I'm wrong.

For now, I'm in inclined to just leave out time_series_metric and then potentially add one in a follow-up.

Sounds very reasonable to me.

…ec/test/20_logs.tests.yml

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
@gregkalapos
Copy link
Contributor Author

I opened #111877 to collect follow up issues. Feel free to add to it if I missed something.

The new otel-data plugin is disabled by default. Since there are multiple things going in this PR - as discussed in other meetings - we should cut here and merge if this is ok to go. With that, other teams can test this more easily. #111877 will address remaining non-blockers.

No lifecycle needed for OTel at this point
@dakrone dakrone self-assigned this Aug 15, 2024
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the Java parts, I can't speak to the content of the mappings (it looks like other Obs folks have already reviewed those parts)

@gregkalapos
Copy link
Contributor Author

@elasticmachine update branch

@gregkalapos gregkalapos merged commit 4e1a84c into elastic:main Aug 19, 2024
15 checks passed
@gregkalapos gregkalapos deleted the otel_plugin branch August 19, 2024 07:24
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Aug 19, 2024
* upstream/main:
  Fail `indexDocs()` on rejection (elastic#111962)
  Move repo analyzer to its own package (elastic#111963)
  Add generated evaluators for DateNanos conversion functions (elastic#111961)
  Clean the last traces from global retention in templates (elastic#111669)
  Fix known issue docs for elastic#111866 (elastic#111956)
  x-pack/plugin/otel: introduce x-pack-otel plugin (elastic#111091)
  Improve reaction to blob store corruptions (elastic#111954)
  Introduce `StreamingXContentResponse` (elastic#111933)
  Revert "Add 8.15.0 known issue for memory locking in Windows (elastic#111949)"
  Test get-snapshots API with missing details (elastic#111903)
  Add 8.15.0 known issue for memory locking in Windows (elastic#111949)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024
* Add YamlTemplateRegistry and OtelIndexTemplateRegistry with resource YAML files

* Fix traces-otel template

* Adding first yml tests

* Base APMIndexTemplateRegistry on YamlTemplateRegistry

* Update OTelPlugin.java

* Update APMIndexTemplateRegistry.java

* Update YamlIngestPipelineConfig.java

* Adding traces tests

* Update x-pack/plugin/otel-data/src/main/resources/component-templates/ecs-tsdb@mappings.yaml

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* Add mapper-version

* Fix code-style

* Rename `status.status_code` to `status.code`

* Update otel@mappings.yaml

Revert back to date due to missing support in ES|QL for date_nanos

* Move dynamic_templates to metrics@mappings in core

* Run gradlew :x-pack:plugin:core:spotlessApply

* Update x-pack/plugin/otel-data/src/main/resources/component-templates/metrics-otel@mappings.yaml

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>

* Update 20_metic_tests.yml

Workaround for TSDB timestamp issue: we push a custom template with higher priority and set  time_series.start_time.

* Update CODEOWNERS

Adding obs-ds-intake-services as owner of the new otel-data plugin.

Since we had some changes, also updating the owner of apm-data to the same team.

* Change dynamic: strict to false

* Skip "Reject invalid top level field" test

* Update 20_metic_tests.yml

* Add boolean as dimension test (skipping it for now)

* Add booleans_to_keywords and enable corresponding test

* Remove processor.event top level mapping

Reason: for metrics and logs we can rely on the name of the datastream. For spans vs. transactions there are other fields we can use.

* Remove booleans_to_keywords

Because booleans are supported now as dimension on TSDB

* Add alias service.language.name -> telemetry.sdk.language

* cleanup

* Update README.md

* Update README.md

* Update docs/changelog/111091.yaml

* Move traces@settings and traces@mappings to core

* Update traces-otel@mappings.yaml

* Review feedback

* Adapt `match` style in tests

* Update docs/changelog/111091.yaml

* Apply suggestions from code review

Co-authored-by: Vishal Raj <vishal.raj@elastic.co>

* Update x-pack/plugin/otel-data/src/main/resources/component-templates/traces-otel@mappings.yaml

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>

* Changing trace_flags to long

Related discussion: elastic#111091 (comment)

* Remove trace_flags

see: elastic/opentelemetry-dev#368 (review)

* Apply suggestions from code review

Co-authored-by: Andrew Wilkins <axwalk@gmail.com>

* Review feedback

* Add store_array_source for span links

* Define constant `data_stream.type` in `template.yaml`s

* Create package-info.java

* Move ecs-tsdb@mappings to index template

Add test to verify that @Custom template can add dynamic templates with a higher precedence

* Update metrics@mappings.json

Remove summary_gauge and summary_counter since they are covered by summary_metrics

* Move clusterService.getClusterSettings().addSettingsUpdateConsumer to registry

* Fix code-style

* Update x-pack/plugin/otel-data/src/yamlRestTest/resources/rest-api-spec/test/20_logs.tests.yml

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* Enable logsdb

* Update traces@settings.json

No lifecycle needed for OTel at this point

---------

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Vishal Raj <vishal.raj@elastic.co>
Co-authored-by: Andrew Wilkins <axwalk@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Felix Barnsteiner <felix.barnsteiner@elastic.co>
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Sep 5, 2024
* Add YamlTemplateRegistry and OtelIndexTemplateRegistry with resource YAML files

* Fix traces-otel template

* Adding first yml tests

* Base APMIndexTemplateRegistry on YamlTemplateRegistry

* Update OTelPlugin.java

* Update APMIndexTemplateRegistry.java

* Update YamlIngestPipelineConfig.java

* Adding traces tests

* Update x-pack/plugin/otel-data/src/main/resources/component-templates/ecs-tsdb@mappings.yaml

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* Add mapper-version

* Fix code-style

* Rename `status.status_code` to `status.code`

* Update otel@mappings.yaml

Revert back to date due to missing support in ES|QL for date_nanos

* Move dynamic_templates to metrics@mappings in core

* Run gradlew :x-pack:plugin:core:spotlessApply

* Update x-pack/plugin/otel-data/src/main/resources/component-templates/metrics-otel@mappings.yaml

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>

* Update 20_metic_tests.yml

Workaround for TSDB timestamp issue: we push a custom template with higher priority and set  time_series.start_time.

* Update CODEOWNERS

Adding obs-ds-intake-services as owner of the new otel-data plugin.

Since we had some changes, also updating the owner of apm-data to the same team.

* Change dynamic: strict to false

* Skip "Reject invalid top level field" test

* Update 20_metic_tests.yml

* Add boolean as dimension test (skipping it for now)

* Add booleans_to_keywords and enable corresponding test

* Remove processor.event top level mapping

Reason: for metrics and logs we can rely on the name of the datastream. For spans vs. transactions there are other fields we can use.

* Remove booleans_to_keywords

Because booleans are supported now as dimension on TSDB

* Add alias service.language.name -> telemetry.sdk.language

* cleanup

* Update README.md

* Update README.md

* Update docs/changelog/111091.yaml

* Move traces@settings and traces@mappings to core

* Update traces-otel@mappings.yaml

* Review feedback

* Adapt `match` style in tests

* Update docs/changelog/111091.yaml

* Apply suggestions from code review

Co-authored-by: Vishal Raj <vishal.raj@elastic.co>

* Update x-pack/plugin/otel-data/src/main/resources/component-templates/traces-otel@mappings.yaml

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>

* Changing trace_flags to long

Related discussion: elastic#111091 (comment)

* Remove trace_flags

see: elastic/opentelemetry-dev#368 (review)

* Apply suggestions from code review

Co-authored-by: Andrew Wilkins <axwalk@gmail.com>

* Review feedback

* Add store_array_source for span links

* Define constant `data_stream.type` in `template.yaml`s

* Create package-info.java

* Move ecs-tsdb@mappings to index template

Add test to verify that @Custom template can add dynamic templates with a higher precedence

* Update metrics@mappings.json

Remove summary_gauge and summary_counter since they are covered by summary_metrics

* Move clusterService.getClusterSettings().addSettingsUpdateConsumer to registry

* Fix code-style

* Update x-pack/plugin/otel-data/src/yamlRestTest/resources/rest-api-spec/test/20_logs.tests.yml

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* Enable logsdb

* Update traces@settings.json

No lifecycle needed for OTel at this point

---------

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Vishal Raj <vishal.raj@elastic.co>
Co-authored-by: Andrew Wilkins <axwalk@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Felix Barnsteiner <felix.barnsteiner@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Data streams Data streams and their lifecycles external-contributor Pull request authored by a developer outside the Elasticsearch team >feature Team:Data Management Meta label for data/management team v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add OpenTelemetry mappings
10 participants