Allow inputs / module to define a target index. #13255
Comments
The Elasticsearch output can pick the index name from We can either require the inputs/modules to explicitly set Currently Beats index settings in the outputs allow users to configure dynamic format strings. These allow users to set the index name based on event contents (this is used for the timestamps). If users run with ILM, then we want them to configure and use a write alias. But with standalone agent or without ILM, what should the index name be? Some options for the index setting:
Being able to configure the index name in the input means that we might have indices different from the indices provisioned via Beats index management. Do we want to forbid users from setting this themselves, and only allow it to be configured via Fleet? Or do we want to add support for provisioning indices via the agent (the idxmgmt package should be usable in isolation)? |
Not sure if we should forbid users from using it without Fleet. My reasoning is that users can do this today by using index format strings in the output and adding fields on each event. Currently, when a user chooses to do this, they are a bit on their own for management: changing templates, index patterns. Now, in the context of a UI-driven experience I presume the following is possible:
cc @ruflin @mattapperson @michalpristas for awareness |
What we support on the input level for index options does not require the full feature set we offer on the global one. I would be happy if, in a first version, we support only pure static strings inside the input. I would hope this simplifies the implementation. |
Implementation-wise there is little difference, if any, between one and the other. The question is more: what do we want to allow users to configure? With static strings we have to expect either a write_alias always, ILM always, or some other entity creating daily indices if ILM is not enabled. Just having a constant string means we don't know, but we don't allow publishing to daily indices directly via Beats. We just hope for the best. |
Also, I am looking at the problem of the checks we want to add to the agent on startup: if we make the index completely dynamic, it would be nearly impossible to know whether an index has been created upfront before sending any data to ES. @ruflin I am sure that we would want a version in the index name, no? |
I think adding a version to the index name is the responsibility of the user or system that configures it. So if we want Right now the format of an index heavily depends on the version of the Beat. As in the future it's only inputs and config files which mostly dictate the format, the beat version becomes almost irrelevant. |
Just met with @ruflin about this. It sounds like the API need is for a single static string per input, which is easy to implement but risks being confusing if people expect all the features that come with other index settings, or expect it to interact nicely with other settings instead of just overriding them. I like the suggestion Steffen mentioned of setting the agent version or some similar suffix rather than the full index string, which would be more broadly useful without growing the configuration complexity too much, but as I understand it from @ruflin that would not address the API need. So I think we should go with a simple static string override, but make sure we frame the configuration so as not to overpromise what it does. One suggestion was to name the configuration field |
I agree with only supporting the static use case, so we effectively limit all the corner cases, and also +1 on making it an explicit name. I don't have anything better than |
Configuring an index per input will not just be a feature to make Fleet/integrations happy. It will also be a feature available to the standalone Beat itself, which users will very likely use. For users coming from output.elasticsearch.index, it will be somewhat odd to learn about the limitations of the index setting on the input (assuming they even look at the docs). With a static name we force users to always use write aliases + ILM or do "manual" alias updates. This is why, at minimum, I would like to have support for setting the timestamp. -1 on introducing There are 4 processes operating on the per-input-index setting: the actual Beat, the agent, Fleet, and integrations. Even if the setting is somewhat powerful in the Beat itself (name + optional date), the agent, Fleet, and integrations still might impose some limitations before passing it down to the actual Beat. But the latter applications operate in a different environment than OSS Beats users. |
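To make the "name + optional date" form from the comment above concrete, here is a rough sketch in Python (not the Go used by Beats). `expand_index` is a hypothetical helper, and it only understands the `yyyy`, `MM` and `dd` tokens inside a `%{+...}` expression; it is an illustration of the idea, not how libbeat format strings are actually implemented.

```python
import re
from datetime import datetime, timezone

# Assumption: only these date tokens are supported in this sketch.
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}
_DATE_EXPR = re.compile(r"%\{\+([^}]+)\}")


def expand_index(pattern: str, ts: datetime) -> str:
    """Expand an optional %{+yyyy.MM.dd}-style date part in an index name.

    A purely static name contains no date expression and is returned as-is.
    """
    def render(match: re.Match) -> str:
        layout = match.group(1)
        for token, directive in _TOKENS.items():
            layout = layout.replace(token, directive)
        return ts.strftime(layout)

    return _DATE_EXPR.sub(render, pattern)


if __name__ == "__main__":
    ts = datetime(2019, 8, 14, tzinfo=timezone.utc)
    print(expand_index("myindex", ts))                 # static name
    print(expand_index("myindex-%{+yyyy.MM.dd}", ts))  # name + daily suffix
```

With a static name the result never changes, so rollover has to come from a write alias or ILM; with the date part, the Beat itself ends up writing to daily indices.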
How are you imagining the setting is specified? My concern with having just an |
Talked to @urso offline about options... it sounds like the best compromise might be to make an |
The plan sounds good to me. |
The last piece of this was merged this morning, the final 7.x backport should go in later today :-) |
Looking at all the PRs which also include the cherry-pick to 7.x we can close this meta issue. |
It seems like the raw_index metadata field currently only gets set if the index name is overridden by the input/module. But if it's not, raw_index is unset. Is it possible to set a default value, or at least have a hardcoded default like the usual "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"? This would greatly simplify my Logstash configuration. I thought the "output.logstash.index" property might be for this purpose but it doesn't seem to affect the raw_index metadata field. Thanks, Shawn |
We are currently working on the agent, and we require that inputs (or modules, depending on the Beat) allow a user to define which index the fetched data must be sent to.
So the high-level requirement is to add a new setting with which users can specify the index they want to target; the Elasticsearch output should take care of reading that value and routing the events to the appropriate destination.
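A minimal sketch of that requirement, written in Python for brevity rather than in the Go used by Beats. The `Event` class, the `index` metadata key, and the `tag_event` / `select_index` functions are hypothetical names chosen for illustration, not libbeat's actual API: the input stamps its configured index onto each event, and the output prefers that value over its own default.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """Hypothetical event: payload fields plus out-of-band metadata."""
    fields: dict
    meta: dict = field(default_factory=dict)


def tag_event(event: Event, input_index: Optional[str]) -> Event:
    """Input side: record the per-input index, if one is configured."""
    if input_index:
        event.meta["index"] = input_index
    return event


def select_index(event: Event, output_default: str) -> str:
    """Output side: an index set on the event wins over the output default."""
    return event.meta.get("index") or output_default


if __name__ == "__main__":
    default = "filebeat-default"
    routed = tag_event(Event({"message": "a"}), "nginx-access")
    plain = tag_event(Event({"message": "b"}), None)
    print(select_index(routed, default))  # index chosen by the input
    print(select_index(plain, default))   # falls back to the output default
```

Keeping the target in event metadata rather than in the event fields means the routing hint is not indexed as part of the document itself.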
I think the configuration for Metricbeat and Filebeat would look like this:
Tasks:
- Add support in libbeat to select the exact index if set on an event: [Filebeat] Select output index based on the source input #14010
- Add per input index setting to Filebeat: [Filebeat] Select output index based on the source input #14010
- Add per module index setting to Metricbeat: [metricbeat] [auditbeat] Add formatted index option to metricbeat / auditbeat modules #15100
- Add per module index setting to Auditbeat (uses Metricbeat framework): [metricbeat] [auditbeat] Add formatted index option to metricbeat / auditbeat modules #15100
- Add input index setting to Winlogbeat: [winlogbeat] Add formatted index setting to event log configs #15198
- Add input index setting to Journalbeat: Journalbeat: add index option to input #15071
- Add function index setting to Functionbeat: Functionbeat: add index option to function configuration #15101