Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscovery #41881

cmacknz · 2024-12-03T21:03:30Z

When using an autodiscovery provider with Filebeat, investigate what happens when the variable substitution would result in two filestream inputs with a non-unique ID. If Filebeat allows the creation of multiple filestream inputs with the same ID, propose a way to prevent this from happening.

Elastic Agent forces the IDs to be unique when using the local dynamic provider, below is a simple example of an agent policy creating this situation. If the my_file variable is omitted, only a single input is created. Creating inputs with the same ID is impossible in Elastic Agent and we want the same behaviour in Filebeat.

For an example of the problem this is trying to prevent, a common mistake was setting the input ID on Kubernetes to kubernetes-container-logs-${kubernetes.pod.name} when an input per running container is used instead of the unique kubernetes-container-logs-${kubernetes.pod.name}-${kubernetes.container.id}

outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    api_key: "example-key"
    preset: balanced

inputs:
  - type: filestream
    id: ${local_dynamic.my_var}
    data_stream:
      dataset: generic
    paths:
      - /var/log/${local_dynamic.my_file}
  
providers:
 local_dynamic:
   enabled: true
   items:
     - vars:
         my_var: A
         my_file: a.log
     - vars:
         my_var: A
         my_file: b.log
     - vars:
         my_var: A
         my_file: c.log

This results in the following running inputs:

./elastic-agent status --output=full
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (HEALTHY) Running
   ├─ info
   │  ├─ id: 88d066e9-f099-4769-9416-f20f1792d081
   │  ├─ version: 8.15.2
   │  └─ commit: 621bbc685c2de8a2ad2c7c7f7cc1eb5fcf77d6d4
   ├─ filestream-default
   │  ├─ status: (HEALTHY) Healthy: communicating with pid '96292'
   │  ├─ filestream-default
   │  │  ├─ status: (HEALTHY) Healthy
   │  │  └─ type: OUTPUT
   │  ├─ filestream-default-A-local_dynamic-0
   │  │  ├─ status: (HEALTHY) Healthy
   │  │  └─ type: INPUT
   │  ├─ filestream-default-A-local_dynamic-1
   │  │  ├─ status: (HEALTHY) Healthy
   │  │  └─ type: INPUT
   │  └─ filestream-default-A-local_dynamic-2
   │     ├─ status: (HEALTHY) Healthy
   │     └─ type: INPUT

The text was updated successfully, but these errors were encountered:

elasticmachine · 2024-12-03T21:03:33Z

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

belimawr · 2024-12-17T20:30:13Z

I've been looking on how the Elastic-Agent implements ID uniqueness for its inputs. Every provider has to add or update its variables into the dynamicProviderState via the AddOrUpdate method. This method requires an ID, that should be unique. Later when the coordinator render the inputs, it generates the final input ID by concatenating the current input ID (after rendering) with the provider name and the ID the provider gave to its set of variables.

Example:

inputs:
  - type: filestream
    id: user-defined-ID-${local_dynamic.my_var}
    data_stream:
      dataset: generic
    paths:
      - /var/log/foo.log
  
providers:
 local_dynamic:
   enabled: true
   items:
     - vars:
         my_var: A

The local provider used the index in the items array as its ID, so once the input is rendered we have:

id: user-defined-ID-A

This gets concatenated with the provider's name and the ID it gave to the variables (0 in this case), which leaves us with the final ID:

user-defined-ID-A-local_dynamic-0

For other providers like Kubernetes or Docker, they use an UID of the element being discovered, which is always unique.

I believe we can implement a similar logic into Beats autodiscover: whenever it renders a template, the provider will provide an unique ID alongside each set of variables, whenever they're used to render an input template, this appended at the end of the input ID together with the provider name. If there was no ID defined in the input template, it's considered as being an empty string and added to the final configuration.

It is important to keep in mind this will be a breaking change for the Filestream input because the final ID will be different than what is defined in the autodiscovery configuration, which will have the side effect of re-ingesting some files when upgrading from 8.x to 9.x.

belimawr · 2024-12-17T20:32:30Z

When using an autodiscovery provider with Filebeat, investigate what happens when the variable substitution would result in two filestream inputs with a non-unique ID.

Once #41954 is merged,only the first input to be created will run, all others inputs with duplicated IDs will fail on creation.

cmacknz · 2024-12-17T20:53:04Z

I believe we can implement a similar logic into Beats autodiscover: whenever it renders a template, the provider will provide an unique ID alongside each set of variables, whenever they're used to render an input template, this appended at the end of the input ID together with the provider name. If there was no ID defined in the input template, it's considered as being an empty string and added to the final configuration.

It is important to keep in mind this will be a breaking change for the Filestream input because the final ID will be different than what is defined in the autodiscovery configuration, which will have the side effect of re-ingesting some files when upgrading from 8.x to 9.x.

Is there a way to do this only if there isn't an ID already defined for the filestream input? Having everybody's Kubernetes logs unconditionally re-ingested on upgrade to 9.0 would not be very nice. I think the "If there was no ID defined in the input template" might cover this but wanted to double check.

belimawr · 2024-12-17T22:22:19Z

Is there a way to do this only if there isn't an ID already defined for the filestream input?

Well, even if there is an filestream input defined, it could not be unique :/ The example you put in the issue description is a good one, if the Filestream input ID is defined as

id: kubernetes-container-logs-${kubernetes.pod.name}

the ID will be present but it won't be unique.

Having everybody's Kubernetes logs unconditionally re-ingested on upgrade to 9.0 would not be very nice.

I definitely agree with this.

The biggest problem is that the autodiscover does not know anything about Filestream or which inputs are running/IDs are in use. So if we modify the ID to ensure uniqueness (like Elastic-Agent does), we will fall into the re-ingestion case upon upgrade.

Implementing something only for the case where IDs are not defines, won't work. Not having an ID defined for the Filestream input leads to data duplication, this has been solved for the Kubernetes autodiscover with documentation and examples, so at this point in time, I believe most, if not all users, already have unique IDs set.

There might be a hacky way out of this by validating the input config and re-trying. Let me investigate more.

leehinman · 2024-12-17T22:38:10Z

Not sure if this helps or not, but the metrics namespace is global, so autodiscover could "check" to see if the metrics ID already exists for other "filestream" configed inputs. autodiscover would have to know that it is making a filestream id though.

belimawr · 2024-12-18T17:45:41Z

Not sure if this helps or not, but the metrics namespace is global, so autodiscover could "check" to see if the metrics ID already exists for other "filestream" configed inputs. autodiscover would have to know that it is making a filestream id though.

That could work, but it feels odd to have the metrics registry as a source of truth for currently running inputs.

Anyways, we were talking during our team meeting today and decided that #41954 is enough, we don't need to be generating unique IDs inside the autodiscover. The goal is to make sure we're not running duplicated IDs and we correctly inform the user.

belimawr · 2024-12-18T18:03:21Z

I looked how the autodiscover behaves when generating Filestream inputs with duplicated IDs and the changes from #41954.

The autodiscover will keep retrying to start inputs until all inputs are successfully started, with the default settings, this happens every 10s.

On a normal execution (all inputs are successfully started), Kubernetes events are gathered as they happen, inputs configurations are generated, however the inputs are only started every second (debounce period).

However when there is an error, we wait for 10s (10 * debouncePeriod). So having inputs that can never start will significantly increase the delay between an K8s event and the resulting input being started/stopped.

Because there is no way to know the specific inputs that failed, I can implement a logic similar to what #41954 does for inputs from inputs.d: if all errors are of type "not reloadable", then we don't try reloading. If at least one error is not of this type, then we retry with the 10*debouncePeirod delay (current behaviour).

What do you think @cmacknz? I could even add it as part of #41954, so once it is merged, everything is consistent. I'll also make sure to add a log error stating inputs won't be retried.

cmacknz · 2024-12-18T21:20:51Z

Yeah I think that makes sense, assuming the autodiscovery process will also log the "not reloadable" error that indicates filestreams are being duplicated.

cmacknz added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Dec 3, 2024

cmacknz assigned AndersonQ Dec 3, 2024

cmacknz changed the title ~~Confirm that duplicate inputs templated using autodiscover with non-unique IDs cannot be created~~ Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscover Dec 3, 2024

cmacknz changed the title ~~Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscover~~ Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscovery Dec 3, 2024

cmacknz mentioned this issue Dec 3, 2024

Filebeat should fail to start when multiple filestream inputs have the same input ID #40540

Closed

2 tasks

AndersonQ mentioned this issue Dec 6, 2024

Invoke m.Run on TestMain and fix test flakiness #41891

Merged

2 tasks

jlind23 assigned belimawr and unassigned AndersonQ Dec 9, 2024

belimawr mentioned this issue Dec 18, 2024

Filestream returns error when an input with duplicated ID is created #41954

Merged

8 tasks

jlind23 closed this as completed in #41954 Jan 4, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscovery #41881

Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscovery #41881

cmacknz commented Dec 3, 2024 •

edited

Loading

elasticmachine commented Dec 3, 2024

belimawr commented Dec 17, 2024

belimawr commented Dec 17, 2024

cmacknz commented Dec 17, 2024

belimawr commented Dec 17, 2024

leehinman commented Dec 17, 2024

belimawr commented Dec 18, 2024

belimawr commented Dec 18, 2024

cmacknz commented Dec 18, 2024

Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscovery #41881

Investigate if it is possible to prevent inputs with non-unique IDs from being templated by autodiscovery #41881

Comments

cmacknz commented Dec 3, 2024 • edited Loading

elasticmachine commented Dec 3, 2024

belimawr commented Dec 17, 2024

belimawr commented Dec 17, 2024

cmacknz commented Dec 17, 2024

belimawr commented Dec 17, 2024

leehinman commented Dec 17, 2024

belimawr commented Dec 18, 2024

belimawr commented Dec 18, 2024

cmacknz commented Dec 18, 2024

cmacknz commented Dec 3, 2024 •

edited

Loading