
Allow configuration of Agent(+)Beats Internal queue (on disk queue) #284

Closed

zez3 opened this issue Mar 22, 2022 · 19 comments
Labels
enhancement New feature or request

Comments

zez3 commented Mar 22, 2022

Describe the enhancement:
Enable the option to configure the internal queue of my Agents and/or the underlying Beats from the Fleet policy.

At the moment this is only available for standalone Agents.
https://www.elastic.co/guide/en/beats/filebeat/current/configuring-internal-queue.html
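For reference, the standalone configuration linked above exposes queue settings along these lines (a sketch; the values shown are illustrative, not recommendations):

```yaml
# In-memory queue (the default)
queue.mem:
  events: 4096            # number of events the queue can hold
  flush.min_events: 2048  # minimum events before flushing to the output
  flush.timeout: 1s       # maximum wait before flushing anyway

# Alternatively, the on-disk queue
#queue.disk:
#  max_size: 10GB         # upper bound on disk space used by the queue
```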

Describe a specific use case for the enhancement or feature:

When a managed Agent dies or gets restarted, we lose messages that were not yet consumed by Elasticsearch.
We would like the option to increase the memory buffer, or, since our disks are NVMe and future ones will achieve even greater I/O speeds, to use the internal disk queue to make sure that we do not miss messages.

zez3 (Author) commented Mar 22, 2022

@joshdover

zez3 (Author) commented Mar 22, 2022

elastic/kibana#81353

This issue was already described in the Kibana issue tracker.

joshdover (Contributor) commented

@nimarezainia Would this fall under the "input settings" effort?

ph (Contributor) commented Apr 1, 2022

I am going to move this to the elastic-agent repository.

@ph ph transferred this issue from elastic/fleet-server Apr 1, 2022
@ph ph added the enhancement New feature or request label Apr 1, 2022
nimarezainia (Contributor) commented Apr 1, 2022

@joshdover @zez3 this can't currently be done efficiently enough for it to make sense. These internal queues are a per-Beat concept, more closely aligned with writing to an output. Many of these parameters exist so that users can tune the throughput they get from the Beat, closely aligned with output parameters like bulk_max_size.

More thought is going into this with the new shipper work, and we will provide easier means for users to manage throughput from the Agent itself. IMO this would have to be considered part of the shipper and not an input parameter.

(fyi @cmacknz )
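For context, the output-level throughput parameters mentioned above look roughly like this in a standalone Beat configuration (a sketch; the host and the numbers are illustrative assumptions):

```yaml
output.elasticsearch:
  hosts: ["https://localhost:9200"]  # assumed endpoint
  worker: 2            # concurrent connections publishing to the output
  bulk_max_size: 1600  # maximum number of events per bulk request
```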

zez3 (Author) commented Apr 2, 2022

@nimarezainia

These internal queues are a per beat concept and more closely aligned with writing to an output. Many of these parameters are there for users to modify the throughput they get from the beat.

Is this not what we need? Or am I missing something here? Also, what do you mean by efficiently? Are you referring to the 2 extra fsyncs, some specific filesystem like btrfs, or spool file locking?

I am asking here for the option to be included in the Fleet policy, with the appropriate limitations and page pool warnings.
Events (from syslog streams) are reaching the Agent (or underlying Beats), but the output is too busy or unavailable (down for maintenance/upgrades, or the network is down), so we need a bigger buffer or a local file that can temporarily store the events for some user-defined time or maximum buffer size. The workaround proposed by some (cough, Professional Services) was a Kafka message queue, but I think we are beyond that now.
The other additional helpful option would be throttling.
None of this is available at the moment for enterprise clients in Fleet, though some of it is available standalone. From my point of view I would like to have all the standalone options in Fleet, but I can settle for just enough.

zez3 (Author) commented Apr 4, 2022

@joshdover
If this is not available or will not be included in the upcoming release, then please add the future shipper memory queue
elastic/elastic-agent-shipper#7
to the roadmap.

nimarezainia (Contributor) commented

@zez3 We are working on providing output-level configurations (such as load balancing, disk queue, and performance tuning). We don't have a timeline yet for when these will be available, but needless to say this is prioritized. You have already seen many of the issues to track; they are listed below. Be aware that these track all the internal architectural work that is ongoing. It's unknown exactly in which release the UI work will be done to expose the configuration parameters to users.

elastic/elastic-agent-shipper#7
elastic/elastic-agent-shipper#33

zez3 (Author) commented Jun 8, 2022

As long as I have an API that I can call, I would be happy to test the new shipper's on-disk queue. The UI part can be implemented later. And hopefully we can finally solve the drops in my deployment. ;)
Regarding queue encryption, I think it will slow things down, but I suppose that impact will be seen and confirmed in the benchmarks. I'll add my 2 security cents on that issue.

joshdover (Contributor) commented

As long as I have an API that I can call, I would be happy to test the new shipper's on-disk queue. The UI part can be implemented later. And hopefully we can finally solve the drops in my deployment. ;)

Yep, we're discussing having a configuration API as a first step, to enable testing use cases before we roll out the UI.

andrewkroh (Member) commented

Is there a workaround that we can document in this issue? For example, is it possible to set queue.mem.events for the Beats through changes to local files like elastic-agent.yml?

zez3 (Author) commented Jul 13, 2022

Is it possible to set queue.mem.events for the Beats through changes to local files like elastic-agent.yml?

Not sure if that would work for a Fleet-managed Agent with its underlying Beats.

cmacknz (Member) commented Jul 13, 2022

Is it possible to set queue.mem.events for the Beats through changes to local files like elastic-agent.yml?

Not sure if that would work for a Fleet-managed Agent with its underlying Beats.

Yes, the agent policy only lets us control the contents of the Beat input and output sections. Since the queue section of a Beat configuration is a separate top-level configuration block, there is no way to modify it through an agent policy without modifying the policy-to-Beat config transformation logic.
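To illustrate the point above: in a rendered Beat configuration the queue is a top-level block, sitting outside the input and output sections that the agent policy can populate (a sketch; paths, hosts, and values are illustrative assumptions):

```yaml
# Top-level queue block: not reachable through a Fleet agent policy
queue.mem:
  events: 4096
  flush.timeout: 1s

filebeat.inputs:          # derived from the policy's input sections
  - type: log
    paths: ["/var/log/*.log"]

output.elasticsearch:     # derived from the policy's output section
  hosts: ["https://localhost:9200"]
```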

zez3 (Author) commented Jul 13, 2022

Oh, now I remember: half a year ago I tried to abuse the Fleet policy API call and force a JSON payload containing the memory queue settings, but that failed. I think I discussed this at the time with @ruflin.

zez3 (Author) commented Jul 13, 2022

If 8.4 will be released soon, perhaps someone could allow a custom policy where we can define such options?

andrewkroh (Member) commented Jul 22, 2022

there is no way to modify it through an agent policy without modifying the policy to beat config transformation logic.

I found code in elastic-agent for an InjectQueue spec rule to derive queue.mem.* settings based on bulk_max_size and worker. It was added in elastic/beats#27429 and then removed from use in elastic/beats#27653 with a reason of "but we aren't ready to include it in an official release yet."

Perhaps we could add that back to spec files for Beats now?

Or, if we wanted to be less prescriptive than the InjectQueue rule, we could add a different transform that takes any queue config from the output settings and moves it to the root of the config for Beats. Then users could manually define their queue settings like:

[Screenshot: example output configuration with user-defined queue settings]
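The original comment showed this as a screenshot; as a hypothetical sketch (the exact keys in the screenshot are unknown), nesting queue settings under the output might look like:

```yaml
output.elasticsearch:
  hosts: ["https://localhost:9200"]
  # Hypothetical: queue settings declared under the output settings,
  # which the proposed transform would move to the root of the Beat config
  queue.mem:
    events: 4096
    flush.timeout: 1s
```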

joshdover (Contributor) commented

Or, if we wanted to be less prescriptive than the InjectQueue rule, we could add a different transform that takes any queue config from the output settings and moves it to the root of the config for Beats. Then users could manually define their queue settings like:

I like this idea, but one concern is the same that I raised here: elastic/elastic-agent-shipper#28 (comment)

We're going to be moving to a single shared queue for all integrations with the shipper, and if we expose these settings now, how they get applied once there's a single queue will not be the same, making the shipper rollout a breaking change. IMO we should defer on this to provide a good upgrade experience once the shipper is GA.

cmacknz (Member) commented Sep 14, 2023

We are planning to expose the queue parameters (although perhaps not the disk queue initially) as part of the agent output configuration; that work is tracked in elastic/beats#35615.

@gadekishore gadekishore changed the title Allow configuration of Agent(+)Beats Internal queue (on disk queue) [Response Required] Allow configuration of Agent(+)Beats Internal queue (on disk queue) Sep 19, 2023
@gadekishore gadekishore changed the title [Response Required] Allow configuration of Agent(+)Beats Internal queue (on disk queue) Allow configuration of Agent(+)Beats Internal queue (on disk queue) Sep 19, 2023
jlind23 (Contributor) commented May 27, 2024

Queue configuration has been added, hence closing this as done.

@jlind23 jlind23 closed this as completed May 27, 2024

7 participants