[Meta] Migrate to ecs@mappings template shipped with Elasticsearch #8542
Comments
I'm proposing to enable this not only on logs data streams but all data streams for integrations.
+1 to enable this for all data streams. Maybe we can split this in two phases, but I think it would be confusing to have ECS mappings imported automatically in some data streams but not others.
We already have some code that loads ECS mappings for validation of documents in tests. It is loaded now under certain conditions, basically for packages that use |
Sure we can do other data streams as well. Metrics seems obvious, but should we do the same for traces? cc @axw
Makes sense to me 👍 There were no major concerns raised from the email that was sent out and the TTL has passed. @jsoriano @jen-huang I think we can move this into the Fleet & Integrations backlog for prioritization and scoping for step 1. Further steps are blocked by Elasticsearch changes at this time.
@joshdover @ruflin: I agree this should be done for metrics data streams as well.
We have done this in the new Elasticsearch apm-data plugin: https://github.com/elastic/elasticsearch/blob/9551fcef4024dc6bfe63ff6e60ff63652306a10c/x-pack/plugin/apm-data/src/main/resources/index-templates/traces-apm%40template.yaml#L18 That's just for I don't think there's any point in changing Fleet to do this for |
I don't think that we'll add dimension definition to |
I don't know the code that would do it, but if we end up with
There is a third option here: we potentially introduce an additional component template with the most common dimensions. This component template is then linked for tsdb indices by Fleet. As soon as elastic/elasticsearch#98384 lands, this becomes obsolete. @ishleenk17 It might be worth creating a separate GitHub issue to discuss this in more detail and also link it in the description.
Fair. It shouldn't make a functional difference if it is added, so if it makes the code simpler then that's fine.
I have created this ticket for tracking and added it to the description as well.
@joshdover - As this change is going to remove the dependency on updating ECS versions.
@muthu-mps this hasn't been implemented yet and we're not yet ready for integrations to move to this pattern. It will also require bumping the
## Summary

I added a reference to the component template `ecs@mappings` when building any index template for an integration. The same behaviour is valid for both logs and metrics index templates.

Linked to elastic/integrations#8542
Close #174905

### Checklist

Delete any items that are not applicable to this PR.

- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: Kyle Pollich <kyle.pollich@elastic.co>
Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
Question for folks involved in the effort here: which group or team owns keeping the |
We've set up an automated test that checks that the |
Great. It looks like this test pulls from ECS main, generates fake data based on the declared type for each field, then I assume verifies that it gets the right mapping. 👍
I would like to provide a summary of what's the current status:
Hey @felixbarny, @ruflin, @zmoog, can you please let me know if I missed anything or if I got something wrong?
A couple of questions still left:
In essence, there should be a flag for integrations to opt-in or opt-out of the generation of the |
@felixbarny I might have missed the discussion about introducing a flag to opt-in or opt-out of the While I would generally agree to introduce a flag to be on the safe side, it would make the implementation a bit more complicated, it won't solve the problem of adding the dynamic fields from ecs@mappings, and it will not be aligned with the default behaviour in Elasticsearch. Also, while looking at the code, I noticed a hard limit of 1024 fields in Kibana that I wasn't aware of. This is a static value that doesn't depend on For all those reasons, I would lean toward removing the field calculation altogether from the Kibana code. |
Interesting find in the Kibana code. Does it mean that even if integrations have more than 1024 fields, the extra fields were never added, at least on the query side? If yes, I agree we can just go with the new setting. There is still the total field limit, meaning that even though by default users didn't search on these fields, they were still indexed, up to 10k fields. I would like to see this field limit also be changed to 1024 (aka removed) by default and, if an integration needs more fields, it overrides the limit. I wonder if this is already possible today through the Elasticsearch settings in the dataset, so no additional changes on our end would be needed. This would also allow users to override the dynamic field limit if needed.
@ruflin, I was surprised as well, but that seems to be the behaviour: truncate the default fields to 1024 no matter what. About lowering the total field limit, I have created "Lower field limit in integrations" #177447
The limit in Fleet is in place to avoid hitting the max clause limit in Elasticsearch. The max clause limit depends on the hardware configuration the ES nodes are running on and it's not impacted by the max total fields limit. I'd also really like to get rid of the default_field generation in Fleet, but there's a risk that for some integrations (like input integrations with an unknown number of fields) this could lead to hitting the max clause limit in ES. Maybe the risk is small enough, though. If we think the risk is too high to be tolerable in a minor version, we should only remove setting the default_field in Fleet for integrations where we're also lowering the total field limit.
@gsantoro @felixbarny In summary, this might imply that there is no change for integrations regarding ECS mappings; we will continue to include them as we currently do?
@zmoog will provide documentation shortly and also run a session on the migration details.
Background
tl;dr - we plan to change all integrations index templates to reference the ecs@mappings component template as a fallback for any ECS fields that are not explicitly mapped by integrations.
As part of the Logs+ project, the Observability team has been improving the Stack's ability to make logs as searchable as possible out of the box. Part of this effort has included introducing a new ecs@mappings component template included in Elasticsearch by default. This template dynamically maps all ECS fields with appropriate mappings if more specific mappings are not defined by other component templates. Today, these default ECS mappings are applied to any `logs-*-*` data stream that uses the default `logs-*-*` index template, and are not in use by our integration data streams.
We want to bring this improvement to integration logs data streams to solve these problems:
By ensuring that all ECS-compatible fields get indexed and are searchable by default, with a single source of truth for the mapping definitions, we can solve these problems for users and integration developers: they will never need to map ECS fields explicitly. Integration developers also will not need to update ECS versions in their packages anymore, relying on the ecs@mappings template to always be up to date and in sync with the Stack version.
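The fallback behaviour described above depends on the order of component templates: Elasticsearch merges the entries of `composed_of` in order, so later entries take precedence over earlier ones. The following is a minimal sketch of an index template body with that ordering, not Fleet's actual implementation. The helper name `build_index_template`, the `priority` value, and the `nginx.access` dataset are assumptions for illustration, and real Fleet-managed templates reference additional component templates that are left out here.

```python
def build_index_template(data_stream_type: str, dataset: str, priority: int = 200) -> dict:
    """Return an index template body whose component templates are listed
    from lowest to highest precedence (later entries win on conflicts)."""
    name = f"{data_stream_type}-{dataset}"
    return {
        "index_patterns": [f"{name}-*"],
        "data_stream": {},
        "priority": priority,  # assumed value, for illustration only
        "composed_of": [
            "ecs@mappings",     # fallback ECS mappings, lowest precedence
            f"{name}@package",  # mappings shipped by the integration
            f"{name}@custom",   # user overrides, highest precedence
        ],
        # The @custom template is optional, so it may not exist yet.
        "ignore_missing_component_templates": [f"{name}@custom"],
    }


template = build_index_template("logs", "nginx.access")
print(template["composed_of"])
```

With this ordering, a field mapped explicitly in `@package` or `@custom` keeps that mapping, and only fields left unmapped by the integration fall through to the dynamic templates in `ecs@mappings`.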
Phases
Step 1: Enable ecs@mappings on all logs data streams
The goal of this step is to:
The proposal is to enable this by default for all integration logs data streams managed by Fleet starting in some stack version (depending on when we land the change). On this Stack upgrade, existing data stream index templates will be automatically updated by Fleet to start referencing the ecs@mappings template, at lower precedence than the `@package` and `@custom` component templates.
We considered a more cautious approach of making this opt-in for each integration, but are leaning towards not doing this for the following reasons:
`@custom` template will override the ecs@mappings and continue to be mapped the same as before
Step 2: Add support for index.query.default_field on fields defined by ecs@mappings
In order to support migrating existing integrations to the new ecs@mappings fields, we need to change how Fleet installs integrations and configures the `index.query.default_field` setting so that searches in Kibana that don't specify a field name will also search against fields mapped by ecs@mappings.
Today, Fleet automatically populates the `index.query.default_field` setting with the list of fields explicitly defined by an integration's field definitions. This needs to change to allow the default of `default_field: ["*"]` again without introducing maxClauseCount errors; see the discussion and possible solutions here: elastic/elasticsearch#102378
Following these Elasticsearch improvements, we will need to introduce package-spec and Fleet changes to change how we specify the default_field setting. This should probably be opt-in to give integration developers a chance to test their changes, but we should consider whether or not this is necessary if there are no breaking changes in the ES change that gets introduced.
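To make the difference between the two approaches concrete, here is a rough sketch of the explicit-list behaviour, including the static 1024-field truncation mentioned in the comments, next to the wildcard default. It is an illustration under stated assumptions and does not reproduce Fleet's code: the function names, the `QUERYABLE_TYPES` set, and the sample field definitions are hypothetical.

```python
MAX_DEFAULT_FIELDS = 1024  # static cap mentioned in the discussion

# Assumption: only these field types are listed for default searches.
QUERYABLE_TYPES = {"keyword", "text", "match_only_text", "wildcard"}


def explicit_default_field(fields: list[dict]) -> list[str]:
    """List explicitly defined, queryable fields, truncated to the cap."""
    names = [f["name"] for f in fields if f.get("type") in QUERYABLE_TYPES]
    return names[:MAX_DEFAULT_FIELDS]


def wildcard_default_field() -> list[str]:
    """The Elasticsearch default: search every eligible field."""
    return ["*"]


# Hypothetical package with more queryable fields than the cap allows.
sample_fields = [{"name": f"field_{i}", "type": "keyword"} for i in range(2000)]
sample_fields.append({"name": "bytes", "type": "long"})

print(len(explicit_default_field(sample_fields)))  # 1024
print(wildcard_default_field())                    # ['*']
```

The explicit list never includes fields that are only mapped dynamically by ecs@mappings, which is why Step 2 is a prerequisite for removing explicit ECS mappings from packages.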
Step 3: Enable ECS system tests in elastic-package
We should enable integration developers to run a standard set of system tests against their data streams with documents of ECS data to ensure that any mapping overrides of ECS fields specified by the integration are always compatible with ECS.
@P1llus can you link to our existing test suite for the legacy dynamic templates?
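One way to picture such a check is a comparison between the types an integration declares and the types ECS expects for the same field names. The sketch below assumes a tiny, hand-written subset of ECS types and a hypothetical notion of "compatible" type families; it does not describe how elastic-package implements its tests.

```python
# Illustrative subset of ECS field types; a real test would load the
# full ECS definition instead of hard-coding it.
ECS_TYPES = {
    "source.ip": "ip",
    "host.name": "keyword",
    "message": "match_only_text",
    "event.duration": "long",
}

# Assumption: types within one family are treated as interchangeable.
COMPATIBLE = {
    "keyword": {"keyword", "constant_keyword", "wildcard"},
    "match_only_text": {"match_only_text", "text"},
}


def incompatible_overrides(package_fields: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {field: (ecs_type, declared_type)} for conflicting overrides."""
    problems = {}
    for name, declared in package_fields.items():
        expected = ECS_TYPES.get(name)
        if expected is None:
            continue  # not an ECS field in this subset, nothing to check
        allowed = COMPATIBLE.get(expected, {expected})
        if declared not in allowed:
            problems[name] = (expected, declared)
    return problems


print(incompatible_overrides({"source.ip": "keyword", "host.name": "keyword"}))
```

In this sketch, overriding `source.ip` as `keyword` is reported as a conflict, while an integration-specific field that is not part of ECS is ignored.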
Step 4: Remove explicit ECS mappings from existing integrations
At this point, existing integrations can start removing explicit ECS mappings and rely on the source of truth ecs@mappings template shipped with the Stack version.
For metrics data streams, we need to resolve #8623 before removing `ecs.yml`.
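As a rough sketch of what this cleanup could look like, the helper below splits a package's field definitions into entries that only point at ECS (candidates for removal) and entries that must stay. It assumes field definitions are already parsed into dictionaries and that plain ECS references carry an `external: ecs` marker. The function name and sample data are hypothetical, and entries with extra attributes (for example a `dimension` flag, as raised in the comments for TSDB) are kept because the shared template may not cover them.

```python
def split_ecs_references(fields: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (keep, drop): `drop` holds entries that only reference ECS."""
    keep, drop = [], []
    for field in fields:
        # A plain reference has nothing besides its name and the ECS marker.
        is_plain_ecs_ref = (
            field.get("external") == "ecs" and set(field) <= {"name", "external"}
        )
        (drop if is_plain_ecs_ref else keep).append(field)
    return keep, drop


# Hypothetical field definitions, for illustration only.
fields = [
    {"name": "host.name", "external": "ecs"},
    {"name": "cloud.region", "external": "ecs", "dimension": True},
    {"name": "nginx.status", "type": "long"},
]
keep, drop = split_ecs_references(fields)
print([f["name"] for f in drop])
```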
Tasks
`default_fields` in Fleet kibana#177605