[Fleet] Add support for customizing integration data streams at more levels of granularity #149484

Open
joshdover opened this issue Jan 25, 2023 · 16 comments
Assignees
Labels
enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@joshdover
Contributor

joshdover commented Jan 25, 2023

We currently support customizing data stream settings and mappings via the @custom component templates that Fleet creates during package installation. Today, these are only supported on a per-data stream basis. This limits the ability to update settings for a group of related data streams and makes the process much more tedious.

Integration users want to be able to define mappings and index settings that get applied to different groups of indices, at different levels of granularity:

  • global (e.g. *)
  • per-type (e.g. logs-*)
  • per-package (e.g. *-nginx.*-*)
  • per-dataset (e.g. logs-nginx.access-*) - we support this today
  • per-dataset-per-namespace (e.g. logs-nginx.access-foo)
  • per-namespace (e.g. *-*-foo)
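
To make the levels concrete, here is a small Python sketch that derives the index pattern for each level from a data stream name. It assumes the `{type}-{dataset}-{namespace}` naming scheme and that the package name is the part of the dataset before the first dot; the function name `granularity_patterns` is hypothetical and not part of Fleet.

```python
def granularity_patterns(data_stream: str) -> dict[str, str]:
    """Return the index pattern for each customization level.

    Assumes the name follows {type}-{dataset}-{namespace} and that the
    dataset itself contains no hyphen.
    """
    ds_type, dataset, namespace = data_stream.split("-", 2)
    # Assumption: the package name is the dataset prefix before the first dot.
    package = dataset.split(".", 1)[0]
    return {
        "global": "*",
        "per-type": f"{ds_type}-*",
        "per-package": f"*-{package}.*-*",
        "per-dataset": f"{ds_type}-{dataset}-*",
        "per-dataset-per-namespace": f"{ds_type}-{dataset}-{namespace}",
        "per-namespace": f"*-*-{namespace}",
    }


patterns = granularity_patterns("logs-nginx.access-foo")
for level, pattern in patterns.items():
    print(f"{level}: {pattern}")
```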

One reason we haven't supported more levels of granularity is that we don't want to create 100s of unused component templates that clutter the UI and/or confuse users. In elastic/elasticsearch#92426 and the related PR elastic/elasticsearch#92436, Elasticsearch will add the ability for an index template to reference component templates that may not yet exist. This would allow us to create the custom component templates only on-demand when a user wants to apply a setting or mapping to a given group of data streams.

For instance, this would allow us to create index templates like this during package installation:

PUT _index_template/logs-nginx.access-foo?ignore_missing=true
{
  "index_patterns": ["logs-nginx.access-*"],
  "template": {
    
  },
  "priority": 250,
  "composed_of": [
    "logs-nginx.access@package",
    "global@custom",
    "logs@custom",
    "global-foo@custom",
    "logs-nginx@custom",
    "logs-nginx.access@custom",
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "ignore_missing_component_templates": [
    "global@custom",
    "logs@custom",
    "global-foo@custom",
    "logs-nginx@custom",
    "logs-nginx.access@custom"
  ]
}
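
A request body of this shape could be generated along the following lines. This is only a sketch: the helper name `build_index_template` is hypothetical, the ordering of `composed_of` simply mirrors the example, and namespace-specific entries such as `global-foo@custom` are left out.

```python
import json

# Component templates that appear at the end of the example above.
FLEET_TEMPLATES = [".fleet_globals-1", ".fleet_agent_id_verification-1"]


def build_index_template(ds_type: str, dataset: str, priority: int = 250) -> dict:
    """Build an index template body with optional @custom component templates."""
    # Assumption: the package name is the dataset prefix before the first dot.
    package = dataset.split(".", 1)[0]
    custom = [
        "global@custom",
        f"{ds_type}@custom",
        f"{ds_type}-{package}@custom",
        f"{ds_type}-{dataset}@custom",
    ]
    return {
        "index_patterns": [f"{ds_type}-{dataset}-*"],
        "template": {},
        "priority": priority,
        "composed_of": [f"{ds_type}-{dataset}@package", *custom, *FLEET_TEMPLATES],
        # Every @custom template may be absent until a user creates it.
        "ignore_missing_component_templates": custom,
    }


body = build_index_template("logs", "nginx.access")
print(json.dumps(body, indent=2))
```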

Users would then be able to manually create new component templates that match the naming convention and then perform a rollover to customize a group of data streams. We could first support this via documentation and later add UI features on top to make it easier.
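
As an illustration of that workflow, the sketch below lists the requests a user might send to customize every nginx logs data stream: create the `logs-nginx@custom` component template, then roll over each affected data stream. The setting value and the `logs-nginx.access-default` data stream name are assumptions chosen for the example, and `customization_requests` is a hypothetical helper that only builds the requests without sending them.

```python
def customization_requests(
    template_name: str, settings: dict, data_streams: list[str]
) -> list[tuple[str, str, dict]]:
    """Return (method, path, body) tuples; nothing is sent to a cluster."""
    requests = [
        ("PUT", f"_component_template/{template_name}", {"template": {"settings": settings}})
    ]
    # Template changes only affect newly created backing indices,
    # so each data stream is rolled over afterwards.
    requests += [("POST", f"{name}/_rollover", {}) for name in data_streams]
    return requests


plan = customization_requests(
    "logs-nginx@custom",
    {"index.codec": "best_compression"},  # example setting, chosen arbitrarily
    ["logs-nginx.access-default"],        # assumed data stream name
)
for method, path, body in plan:
    print(method, path, body)
```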

Note that implementing this would require that we leverage the package installation format versioning #121099 to reinstall all index templates on the next stack upgrade.


@joshdover joshdover added the Team:Fleet Team label for Observability Data Collection Fleet team label Jan 25, 2023
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@joshdover
Contributor Author

cc @hop-dev would like to get your opinions on this since you've spent a lot of time in this area

@joshdover joshdover changed the title [Fleet] Add support for customizing integration data stream at more levels of granularity [Fleet] Add support for customizing integration data streams at more levels of granularity Jan 25, 2023
@joshdover joshdover added the enhancement New value added to drive a business result label Jan 25, 2023
@hop-dev
Contributor

hop-dev commented Jan 27, 2023

I love it! We would need the namespace specific index templates discussed in #121118 for the two namespace specific customisations to work:

per-dataset-per-namespace (eg. logs-nginx.access-foo)
per-namespace (eg. *-*-foo)

I'm not sure if your example is supposed to show a namespace-specific index template, @joshdover. Should "index_patterns": ["logs-nginx.access-*"], be "index_patterns": ["logs-nginx.access-foo"],?

Maybe we could add the package-specific customisations as a smaller win in the short term, since the namespace index templates are a large task.

@joshdover
Contributor Author

joshdover commented Jan 27, 2023

@hop-dev good catch, yes that should be adjusted (fixed now). We would still need namespace-specific templates as well.

+1 on moving all the other ones forward first. Likely a fairly low effort change since we've done these kinds of changes in the past.

@ruflin
Member

ruflin commented Jan 30, 2023

I see how the above proposal is going to help a lot to solve it on the technical level. The part I'm now more concerned about is the user experience. If a user updates a template on the integration level, will it automatically trigger all the rollovers that are needed? (just one example)

The global templates are even trickier, as these can have a massive effect. Also, what about the priorities and debugging? If a user changes a global setting but lower-level ones have priority, how does the user debug this or get informed about the changes?
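
On the priority question: Elasticsearch merges component templates in the order they appear in `composed_of`, so later entries override earlier ones. The sketch below simulates that merge with plain dictionaries as one way to reason about which template wins. It is a simplified model (a recursive dict merge), not Elasticsearch's actual implementation, and the template contents are made up.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge two dicts; values from `override` win on conflict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(composed_of: list[str], templates: dict[str, dict]) -> dict:
    """Merge templates in composed_of order, skipping ones that do not exist."""
    result: dict = {}
    for name in composed_of:
        # Missing templates are skipped, as with ignore_missing_component_templates.
        result = deep_merge(result, templates.get(name, {}))
    return result


# Made-up template contents for illustration only.
templates = {
    "global@custom": {
        "settings": {"index": {"number_of_replicas": 2, "codec": "best_compression"}}
    },
    "logs-nginx@custom": {"settings": {"index": {"number_of_replicas": 0}}},
}
effective = resolve(["global@custom", "logs@custom", "logs-nginx@custom"], templates)
print(effective)
```

In this model the global replica count is silently overridden by the package-level template, which is the kind of situation a user would need tooling to spot.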

Some related discussions happening at the moment are about users who want to use their own component template for 3 integrations (not all). With the above, we can already offer a much better solution, since they would only have to update 3 custom templates. What if we offer just 3 options for extension by default, ignoring global? These are our recommended extension paths. But in addition, we would allow users to add their own component template through a Fleet API. Fleet would add it to the end of the list, add it to ignore_missing and manage it. On a package upgrade, the added custom template would still exist because it was added through Fleet. If it is added manually, it would not stay there.

The above would give us an escape hatch for expert users without breaking our experience. In case of issues, it could easily be detected that a component was added, and it could potentially be removed again.
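
A rough sketch of that escape hatch, assuming Fleet keeps its own record of user-added component templates so they can be re-applied after a package upgrade. `add_user_component_template`, `reapply_after_upgrade` and the template name `my-ilm-settings` are hypothetical; none of them are existing Fleet APIs.

```python
def add_user_component_template(template: dict, name: str) -> dict:
    """Return a copy of the index template with `name` appended to the end
    of composed_of and listed as ignorable if it does not exist."""
    updated = dict(template)
    composed = list(template.get("composed_of", []))
    ignored = list(template.get("ignore_missing_component_templates", []))
    if name not in composed:
        composed.append(name)
    if name not in ignored:
        ignored.append(name)
    updated["composed_of"] = composed
    updated["ignore_missing_component_templates"] = ignored
    return updated


def reapply_after_upgrade(fresh_template: dict, user_templates: list[str]) -> dict:
    """Re-add every user template that Fleet has on record to a freshly
    installed index template."""
    for name in user_templates:
        fresh_template = add_user_component_template(fresh_template, name)
    return fresh_template


installed = {
    "composed_of": ["logs-nginx.access@package", "logs-nginx.access@custom"],
    "ignore_missing_component_templates": ["logs-nginx.access@custom"],
}
upgraded = reapply_after_upgrade(installed, ["my-ilm-settings"])
print(upgraded["composed_of"])
```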

@P1llus
Member

P1llus commented Jan 30, 2023

I think one thing users would want is to reuse custom component templates they already have, which do not follow a specific naming convention. For example, let's call one custom-ilm-component; it includes certain ILM settings they want to override for most (but not all) of their integrations.

When you add the integration to a specific policy, you usually have an overview of the custom ingest pipelines and custom component templates that are being generated for this integration.

It would be great to have a similar UI field to add custom components that already exist, maybe with a drop-down of available custom components (only listing the ones not managed by Fleet, to avoid too many choices?)
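
One way such a drop-down could be populated is by filtering the response of the get component template API down to templates that Fleet does not manage. The sketch assumes Fleet-managed templates carry `_meta.managed_by: "fleet"`, which should be verified against the Fleet version in use; the sample response and the helper name `user_managed_templates` are made up.

```python
def user_managed_templates(response: dict) -> list[str]:
    """Return names of component templates not marked as managed by Fleet."""
    names = []
    for entry in response.get("component_templates", []):
        meta = entry["component_template"].get("_meta", {})
        # Assumption: Fleet marks its own templates with managed_by = "fleet".
        if meta.get("managed_by") != "fleet":
            names.append(entry["name"])
    return sorted(names)


# Made-up response in the shape returned by GET _component_template.
sample = {
    "component_templates": [
        {
            "name": "logs-nginx.access@package",
            "component_template": {"template": {}, "_meta": {"managed_by": "fleet"}},
        },
        {"name": "custom-ilm-component", "component_template": {"template": {}}},
    ]
}
choices = user_managed_templates(sample)
print(choices)
```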

@ruflin
Member

ruflin commented Jan 31, 2023

elastic/elasticsearch#92436 just got merged, which should create a foundation for this general feature.

@felixbarny
Member

I've created an Elasticsearch issue for this: elastic/elasticsearch#97664. I'd like to propose closing this issue in favor of the Elasticsearch issue as I think this feature shouldn't be exclusive to Fleet. More on the reasoning about that in the issue.

@joshdover
Contributor Author

We still need to make changes in Fleet to use the new component template names if/when they get support in ES. Let's keep this one open.

@joshdover
Contributor Author

Assigning @strawgate while they work on prioritization and discussions with the Elasticsearch team on scoping out a solution.

@mbudge

mbudge commented Jan 23, 2024

Can you put global@custom at the top so we can override @package?

"global@custom",
"logs-nginx.access@package",
"logs@custom",
"global-foo@custom",
"logs-nginx@custom",
"logs-nginx.access@custom",
".fleet_globals-1",
".fleet_agent_id_verification-1"

We want to add a lowercase normaliser to case-sensitive keyword fields that users often search, because they miss results when they use the wrong case.

One of the main fields is host.name, which can be hostname0001 or HOSTNAME0001 or Hostname0001.

We want to apply the lowercase normaliser across all the logs data streams so users don't need to worry about this.

Elastic already has a steep learning curve, so this would make the platform easier to use.
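
As a sketch of the mapping change being requested, a `logs@custom` component template could look roughly like the body built below. It relies on the built-in `lowercase` normalizer for keyword fields; whether this merges cleanly with the `host.name` mapping that each package's template defines is not verified here. The `matches` function is only a plain-Python illustration of the effect on exact-term matching, not how Elasticsearch evaluates queries.

```python
import json

# Hypothetical body for PUT _component_template/logs@custom.
logs_custom = {
    "template": {
        "mappings": {
            "properties": {
                "host": {
                    "properties": {
                        "name": {"type": "keyword", "normalizer": "lowercase"}
                    }
                }
            }
        }
    }
}


def matches(indexed_value: str, query: str, normalized: bool) -> bool:
    """Model an exact-term match with or without lowercase normalization
    applied to both the indexed value and the query term."""
    if normalized:
        return indexed_value.lower() == query.lower()
    return indexed_value == query


print(json.dumps(logs_custom, indent=2))
for value in ["hostname0001", "HOSTNAME0001", "Hostname0001"]:
    print(value, matches(value, "hostname0001", normalized=True))
```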

@herrBez
Contributor

herrBez commented May 6, 2024

Hi there,

Would it make sense to split the issue and distinguish the "namespace" case (which is more difficult to support) from the global@custom, type@custom, type-dataset@custom cases? It would also be more similar to what we do with ingest pipelines starting with release 8.12.

@crocswithsocks

Very much looking forward to this feature being added to Fleet-managed index templates. In my use case I need to enable the _size mapping on every index in the cluster. This is a painful task, as it requires modifying every integration's @custom component templates to include this setting. Using the global@custom component template would greatly decrease the amount of time spent modifying each integration's component templates!

@bgebelek

bgebelek commented Jul 17, 2024

I agree with @crocswithsocks. Adding this feature would make modifying templates much easier.

@mbudge

mbudge commented Aug 6, 2024

Please put logs@custom/global@custom above the integration @package so we can add the lowercase normaliser to the following fields as a minimum:

host.name
user.name
user.target.name

There are about 20-30 more ECS fields I want to add the lowercase normaliser to.

Our users are continuously missing security logs because it's impossible to find all the hits using KQL when the case is mixed. I know ESQL has a lowercase function, but most users still use KQL in different parts of the platform.

I've added the lowercase processor to the logs@custom ingest pipeline, but the best way to do this is to add the lowercase normaliser to the mappings.

@nimarezainia
Contributor

#190730
