[Fleet] Potential breaking change with APM data streams (maybe others) and Fleet ingest pipeline customization hooks #175254
Comments
Pinging @elastic/fleet (Team:Fleet) |
Our priorities for this are as follows right now:
I'm updating the task list in the description to reflect these next steps. |
I pulled all of the APM datasets from the integration source:
I believe the only collisions possible here are
Next, I expanded my search to all integration data streams defined in https://github.com/elastic/integrations. Here's the full set of all integration data streams: Show list
From what I understand, the only way a collision like this is possible is when an integration data stream defines a custom The list of data streams that explicitly define Show list
A quick manual glance through these data streams reveals some additional cases where there's a collision issue
|
@kpollich I did a similar thing: I tried to install all the packages and logged every package with a dataset that is the same as the package name. I got the following list:
I think the ideal solution here would have been to have something like this for dataset (but we introduce the package
|
Thanks, Nicolas - that list aligns with my findings, but now I realize that only cases where the dataset begins with the package name value will result in collisions, so I think the actual conflicting datasets are limited to the APM traces data streams and the Elastic Agent logs data streams above. For instance, here's what the synthetics ingest pipelines in question look like on 8.12.0:
There's no collisions here as However, I think the
If a user had added a custom ingest pipeline for
This makes sense to me, so we'd have this for the APM data streams in question instead of what we have now:
I think it's a little confusing just because the naming is not super clear here on the integration side, but perhaps we can correct that by adding a dynamic
|
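The collision rule discussed in the comments above can be sketched roughly as follows. This is an illustration only, not Fleet's code: it assumes the 8.12.0 naming scheme from the issue description, treats a collision as a dataset whose name is exactly the package name, and uses a made-up `examplepkg` package next to the APM datasets named in this thread.

```python
from collections import defaultdict

# (package, type, dataset) triples. The APM entries come from this thread;
# the "examplepkg" entries are hypothetical and only show a non-colliding package.
DATA_STREAMS = [
    ("apm", "traces", "apm"),
    ("apm", "traces", "apm.rum"),
    ("apm", "traces", "apm.sampled"),
    ("examplepkg", "logs", "examplepkg.access"),
    ("examplepkg", "logs", "examplepkg.error"),
]


def find_collisions(data_streams):
    """Map each colliding hook name to the other datasets it would also fire for."""
    by_package_type = defaultdict(list)
    for package, ds_type, dataset in data_streams:
        by_package_type[(package, ds_type)].append(dataset)

    collisions = {}
    for (package, ds_type), datasets in by_package_type.items():
        # 8.12.0 package-level hook: `${type}-${package}@custom`.
        # It collides when a dataset-level hook `${type}-${dataset}@custom`
        # has the same name, i.e. when a dataset equals the package name.
        if package in datasets:
            others = [d for d in datasets if d != package]
            collisions[f"{ds_type}-{package}@custom"] = others
    return collisions


print(find_collisions(DATA_STREAMS))
```

With this input only `traces-apm@custom` is reported, and the listed datasets (`apm.rum`, `apm.sampled`) are the ones that would unexpectedly run a pipeline written for the `apm` dataset.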
Adding the package name to the dataset custom will be a breaking change, so we may want to have a way to opt-in, with a config flag? |
That's a good point, @nchaulet. What if we leave the Technically there is still room for the same kind of breaking change between 8.12.0 and 8.12.1 if we take this path, but I think the scope is narrow enough that it would be okay. The remediation will be to just rename your pipelines which should be manageable for users I think. |
Yes I think it could work so you will have for apm.rum
and for apm (we still need to have some deduplication implemented, right?)
|
Playing around with an implementation for a fix here and making good progress. The naming is a bit wonky and sort of diverges from the data stream naming convention which I feel is not totally ideal, e.g.
Or, for some more prudent APM ingest pipelines:
I think the naming is a little clunky, but hopefully it's not too confusing with the description in place. Adding the package name as part of the expected pipeline name seems to be our only path to preventing collisions. |
Yes, the name is a little off compared with the naming discussion that happened here, and a little different from what we have for component templates too (discussion here); it may be confusing for users (maybe worth getting @felixbarny's thoughts here). Thinking out loud here: could we have a prefix like apm.rum
apm
Not sure this could happen without a breaking change |
from #175254 (comment)
@kpollich @nchaulet why would you deprecate |
I agree that this wouldn't comply with the new naming conventions we've established in elastic/elasticsearch#96267 and it'll probably also be confusing to the user as to which data streams these custom pipelines apply to. Therefore, and because they have been around for longer, I'd bias towards not renaming the custom pipelines for a dataset. We could declare the names of the new extension points bogus and rename them in a breaking manner. For example:
I don't think that a suffix like Or we can keep those around as deprecated that don't cause a conflict. There's also a new |
@simitt - I don't think I agree with this assessment as far as which pipeline pattern is intended to appear here. We want to support a pattern like So, as far as the Fleet implementation is concerned, the expected behavior is that There are real world use cases for this customization, e.g. decorating all To be clear, the list of pipeline processors that appear for all integration as of 8.12.0 aligned with their "patterns" are as follows:
We would deprecate So, the fix we're proposing above is to deprecate the However, @felixbarny's suggestion is more feasible, e.g. this point rings true:
I'm in agreement with this, so a path forward would be to rename the newer With this in mind, I'm proposing we do the following:
The |
There's still technically an edge case with the new
We'd have a datastream pattern of
You could also footgun yourself by providing a custom
So maybe we have no choice but to add a restriction on dataset naming to the package spec? The collision case here is, I think, less likely than the current implementation but it still exists. |
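The remaining edge case can be sketched as a small check. This is a hedged illustration, not the package-spec validation itself: it assumes the package-level hook takes the form `${type}-${package}.${suffix}@custom`, where the suffix is whatever reserved word is chosen (this thread mentions `package` and, later, `integration`). The function name is hypothetical.

```python
def collides_with_package_hook(package: str, dataset: str, suffix: str = "integration") -> bool:
    """True if a dataset-level hook name would equal the package-level hook name."""
    ds_type = "logs"  # any type works: both names share the same type prefix
    package_hook = f"{ds_type}-{package}.{suffix}@custom"
    dataset_hook = f"{ds_type}-{dataset}@custom"
    return package_hook == dataset_hook


# A dataset literally named `<package>.<suffix>` reintroduces the collision.
print(collides_with_package_hook("apm", "apm.integration"))
print(collides_with_package_hook("apm", "apm.sampled"))
print(collides_with_package_hook("apm", "apm.package", suffix="package"))
```

A restriction in the package spec would amount to rejecting any dataset for which this check returns true.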
That sounds reasonable to me. |
I filed elastic/package-spec#699 to capture the package spec change proposed above. |
That is exactly what I see as the problem and what is breaking the apm use case; +1 on finding a solution where no deprecation of pre Regarding the proposed solution by @felixbarny and @kpollich, can you clarify what that would look like for the apm case?
Would that ultimately lead to the following?
and
|
@simitt - This is close, but the Here's what the pipeline processors on these data streams look like on my PR branch - #175448
|
FYI with the We could use |
#175448 has been updated to use @simitt - I'll hold off on merging until you can take a look at the above and verify this is acceptable from the APM side. |
In the new apm-data Elasticsearch plugin we have the following logic: https://github.com/elastic/elasticsearch/blob/9b4647cfc6d39987cc3fd4f44514bca403d4808f/x-pack/plugin/apm-data/src/main/resources/ingest-pipelines/apm%40default-pipeline.yaml#L35-L56 That is, we invoke:
So IIUC we should replace that third one with |
@simitt and I discussed this just now, and rather than making it consistent I'm going to remove that custom pipeline from the apm-data plugin. The reason is that "integrations" and "packages" no longer make sense, conceptually, when taking Fleet or integrations out of the picture. |
@kpollich your proposal looks good from an apm perspective - thanks for finding a non-breaking solution. |
…ns + add descriptions to each pipeline (#175448)

## Summary

Closes #175254
Ref #168019
Ref #170270

In 8.12.0, Fleet unintentionally shipped a breaking change in #170270 for APM users who make use of a custom ingest pipeline for the `traces-apm` data stream. If a user had previously defined this ingest pipeline to customize documents ingested for the `traces-apm` data stream (defined [here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3)), then they would unexpectedly see that pipeline called when documents were ingested to the `traces-apm.rum` and `traces-apm.sampled` data streams as well.

This PR addresses this collision by adding a `.package` suffix to the "package level" ingest pipeline introduced in 8.12.0.

So, in 8.12.0 a processor would be defined as such on the `traces-apm.rum` or `traces-apm.sampled` ingest pipeline:

```
{
  "pipeline": {
    "name": "traces-apm@custom",
    "ignore_missing_pipeline": true
  }
}
```

This PR replaces the pipeline with one that looks as follows:

```
{
  "pipeline": {
    "name": "traces-apm.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for all data streams of type `traces` defined by the `apm` integration"
  }
}
```

**To be clear: this is a breaking change if you have defined the `traces-apm@custom` ingest pipeline on 8.12. In 8.12.1, it will no longer be called for documents ingested to the `traces-apm`, `traces-apm.rum`, or `traces-apm.sampled` data streams. You will need to rename your pipeline to `traces-apm.package@custom` to preserve this behavior.**

This change also applies to `logs-elastic_agent.*` ingest pipelines. See [this comment](#175254 (comment)) for more information.

There is still technically room for a collision, though it's unlikely, if the data stream name is `package`. This will be handled by a package spec validation proposed in elastic/package-spec#699.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
…oid collisions + add descriptions to each pipeline (#175448) (#175547)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[Fleet] Update Fleet's custom ingest pipeline names to avoid collisions + add descriptions to each pipeline (#175448)](#175448)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Kyle Pollich <kyle.pollich@elastic.co>
@kilfoyle - FYI now that this has landed, I'm going to open a docs issue later today with a draft of what we should include under the breaking change section of the 8.12.1 release notes. |
@kpollich Sounds good. Thanks so much for writing that up! |
Docs issue: elastic/ingest-docs#861 |
Stop invoking the '<data_stream.type>-apm@custom' ingest pipeline (if it exists). This was done to align with what Fleet was doing, which turned out to be flawed due to naming collisions. e.g. this would mean that the data stream `traces-apm.rum-default` would invoke `traces-apm@custom`, which would also match the pipeline intended to be invoked only for the data stream `traces-apm-default`. The Fleet solution was to rename the convention to `<data_stream.type>-<package>.integration@custom`. This does not make sense for the apm-data plugin, and we see no use case for this anyway. The behaviour was added in 8.12.1 and has not been documented, so this is non-breaking. See also elastic/kibana#175254 (comment)
@amolnater-qasource - FYI we updated the names of these pipelines. Not sure if this impacts existing test cases but I wanted to flag this issue to you. See relevant PR + documentation issue above as well. |
Hi @kpollich Thank you for the update. We have updated 2 test cases for this feature in TestRail at the following links:
We have validated this issue on the 8.13.0-SNAPSHOT Kibana build and had the observations below.
Observations:
Build details: Screen-Cast: Further we will revalidate this once latest 8.12.1 BC build is available. Please let us know if we are missing anything here. |
Hi Team, We have revalidated these changes on latest 8.12.1 BC1 kibana cloud environment and found it working fine now. Observations:
Build details: Hence we are marking this as QA:Validated. Please let us know if anything else is required from our end. |
Testing notes (from APM)
The 8.12.1 fix is working as expected. I confirm the following ingest pipelines do not exhibit the bug from 8.12.0.
8.12.0
[
{
"rename": {
"field": "observer.id",
"target_field": "agent.ephemeral_id",
"ignore_missing": true
}
},
{
"date": {
"field": "_ingest.timestamp",
"formats": [
"ISO8601"
],
"ignore_failure": true,
"output_format": "date_time_no_millis",
"target_field": "event.ingested"
}
},
{
"pipeline": {
"name": "global@custom",
"ignore_missing_pipeline": true
}
},
{
"pipeline": {
"name": "traces@custom",
"ignore_missing_pipeline": true
}
},
{
"pipeline": {
"name": "traces-apm@custom",
"ignore_missing_pipeline": true
}
},
{
"pipeline": {
"name": "traces-apm.sampled@custom",
"ignore_missing_pipeline": true
}
}
]
8.12.1
[
{
"rename": {
"field": "observer.id",
"target_field": "agent.ephemeral_id",
"ignore_missing": true
}
},
{
"date": {
"field": "_ingest.timestamp",
"formats": [
"ISO8601"
],
"ignore_failure": true,
"output_format": "date_time_no_millis",
"target_field": "event.ingested"
}
},
{
"pipeline": {
"name": "global@custom",
"ignore_missing_pipeline": true,
"description": "[Fleet] Global pipeline for all data streams"
}
},
{
"pipeline": {
"name": "traces@custom",
"ignore_missing_pipeline": true,
"description": "[Fleet] Pipeline for all data streams of type `traces`"
}
},
{
"pipeline": {
"name": "traces-apm.integration@custom",
"ignore_missing_pipeline": true,
"description": "[Fleet] Pipeline for all data streams of type `traces` defined by the `apm` integration"
}
},
{
"pipeline": {
"name": "traces-apm.sampled@custom",
"ignore_missing_pipeline": true,
"description": "[Fleet] Pipeline for the `apm.sampled` dataset"
}
}
] |
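The difference between the two listings above can be checked mechanically. The following is a small sketch rather than part of the original testing notes: it reproduces only the `pipeline` processors (plus one other processor to show the filter), and the helper name `custom_pipeline_names` is made up for illustration.

```python
def custom_pipeline_names(processors):
    """Names of the `pipeline` processors, in the order they run."""
    return [p["pipeline"]["name"] for p in processors if "pipeline" in p]


# Abbreviated copies of the 8.12.0 and 8.12.1 processor lists shown above.
V8_12_0 = [
    {"rename": {"field": "observer.id", "target_field": "agent.ephemeral_id"}},
    {"pipeline": {"name": "global@custom", "ignore_missing_pipeline": True}},
    {"pipeline": {"name": "traces@custom", "ignore_missing_pipeline": True}},
    {"pipeline": {"name": "traces-apm@custom", "ignore_missing_pipeline": True}},
    {"pipeline": {"name": "traces-apm.sampled@custom", "ignore_missing_pipeline": True}},
]
V8_12_1 = [
    {"rename": {"field": "observer.id", "target_field": "agent.ephemeral_id"}},
    {"pipeline": {"name": "global@custom", "ignore_missing_pipeline": True}},
    {"pipeline": {"name": "traces@custom", "ignore_missing_pipeline": True}},
    {"pipeline": {"name": "traces-apm.integration@custom", "ignore_missing_pipeline": True}},
    {"pipeline": {"name": "traces-apm.sampled@custom", "ignore_missing_pipeline": True}},
]

before = custom_pipeline_names(V8_12_0)
after = custom_pipeline_names(V8_12_1)

removed = [name for name in before if name not in after]
added = [name for name in after if name not in before]

print("removed:", removed)  # removed: ['traces-apm@custom']
print("added:", added)      # added: ['traces-apm.integration@custom']
```

The only change on the `traces-apm.sampled` pipeline is that the package-level hook moved from `traces-apm@custom` to `traces-apm.integration@custom`; the dataset-level hook is untouched.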
* changelog: add fleet ingest pipeline breaking change See elastic/kibana#175254 Part of #12498
* changelog: add fleet ingest pipeline breaking change See elastic/kibana#175254 Part of #12498 (cherry picked from commit 6ed334a) # Conflicts: # changelogs/8.12.asciidoc
…12556) (#12567) * changelog: add fleet ingest pipeline breaking change (#12556) * changelog: add fleet ingest pipeline breaking change See elastic/kibana#175254 Part of #12498 (cherry picked from commit 6ed334a) # Conflicts: # changelogs/8.12.asciidoc * Fix conflict --------- Co-authored-by: Carson Ip <carsonip@users.noreply.github.com> Co-authored-by: Carson Ip <carson.ip@elastic.co> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Summary
In 8.12.0, Fleet introduced new extension points for ingest pipeline customization in the form of additional `pipeline` processors in Fleet-managed ingest pipelines:
- `global@custom`
- `${type}@custom`, e.g. `logs@custom`
- `${type}-${package}@custom`, e.g. `logs-nginx@custom`

These new extension points allow for more granular customization of ingestion for various use cases, for instance applying global processing across all `logs` data streams.

The existing extension point of the pattern `${type}-${dataset}@custom`, e.g. `logs-apache.logs-my_namespace@custom`, is preserved, and is called as the last `pipeline` processor in each Fleet-managed ingest pipeline.

Problem 1 - Duplicate pipeline processors
APM defines a `traces-apm` data stream here. Because the package name `apm` is the same as the dataset `apm`, Fleet creates a duplicate `pipeline` processor in the final ingest pipeline for this data stream, e.g.

In the example above, the first `traces-apm@custom` processor is of the form `${type}-${package}@custom` while the second is of the form `${type}-${dataset}@custom`. This duplication should be avoided.

Problem 2 - Breaking change for `traces-apm.sampled` data stream
APM also defines a `traces-apm.sampled` data stream here. Because this data stream extends the `traces-apm` data stream's name, Fleet's customization hooks introduce a breaking change to its processing scheme.

For example, prior to 8.12.0, the `traces-apm.sampled-X.Y.Z` ingest pipeline would have the following pipeline processor defined:

Following 8.12.0, this pipeline will now have these processors defined:

The problem (highlighted in the code block above) is that the `traces-apm@custom` processor, which is intended to be of the form `${type}-${package}@custom`, overlaps with the `traces-apm@custom` pipeline defined for the `traces-apm` data stream above, which is intended to be of the form `${type}-${dataset}@custom`. This is technically the same problem as Problem 1 above, but it manifests as a potential breaking change for APM users who have customized their ingest scheme.

If an APM user is relying on customizations they made to the `traces-apm@custom` ingest pipeline (which was set up by default in releases prior to 8.12.0), they will now unexpectedly see that pipeline firing on data ingested to the `traces-apm.sampled` data stream. This is a breaking change and should be communicated as such.

With both problems above, we likely need some kind of additional specificity to avoid the case where a dataset name overlaps with a package name, as is the case with APM. It'd be great to query the integrations repo to see if we can detect other places where this may be the case and alert those teams.
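Both problems fall out of the naming patterns listed in the summary. As a minimal sketch (not Fleet's actual implementation; the function names are made up for illustration), the hook names for a data stream under the 8.12.0 scheme can be generated like this:

```python
def hooks_8_12_0(ds_type: str, package: str, dataset: str) -> list:
    """Custom pipeline names called for one data stream, in order (8.12.0 scheme)."""
    return [
        "global@custom",
        f"{ds_type}@custom",
        f"{ds_type}-{package}@custom",  # package-level hook, new in 8.12.0
        f"{ds_type}-{dataset}@custom",  # dataset-level hook, called last
    ]


def duplicated(names: list) -> list:
    """Names that appear more than once, in first-seen order."""
    seen, dupes = set(), []
    for name in names:
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


apm = hooks_8_12_0("traces", "apm", "apm")
sampled = hooks_8_12_0("traces", "apm", "apm.sampled")

# Problem 1: the package-level and dataset-level names coincide for `traces-apm`.
print(duplicated(apm))
# Problem 2: `traces-apm.sampled` now also calls `traces-apm@custom`.
print("traces-apm@custom" in sampled)
```

For `traces-apm` the generated list contains `traces-apm@custom` twice, and for `traces-apm.sampled` the package-level entry has the same name as the pipeline a user may have written only for the `traces-apm` data stream.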
In the immediate term, we need to communicate this as a known issue + breaking change to our users by adding documentation and updating our 8.12.0 release notes. Following that, let's try to come to a decision quickly on how we can fix the root issue with the duplication/lack of specificity.
cc @simitt @lucabelluccini @nchaulet @kilfoyle
Tasks