[aws.vpcflow] Default max_number_of_messages to 1 #4599

@andrewkroh andrewkroh commented Nov 8, 2022

What does this PR do?

For users who are getting started with ingesting VPC flow logs with the aws-s3 input, using max_number_of_messages: 1 will provide a better experience. This is because each VPC flow log file usually contains many thousands of messages. For example, a single S3 object might contain 100k events, which is sufficient to keep the Agent's internal queue full. Having multiple S3 objects in flight by default often leads to timeouts or connection resets because the overall processing time for each object increases.

This setting usually needs to be tuned in conjunction with the queue.mem and output.elasticsearch settings, and I think max_number_of_messages: 1 is better aligned with the queue and output defaults than 5.

This change will not affect users that currently have the integration added to policies. It will only affect new additions to agent policies.
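
As a rough illustration of the reasoning in this description, the sketch below estimates how many times the internal queue would have to fill and drain to clear everything that is in flight. It is a back-of-the-envelope model, not a measurement. The 100k events per object comes from the description; the queue size of 4096 events is an assumed value for queue.mem.events, the function name is hypothetical, and the model assumes each SQS message refers to one S3 object.

```python
import math


def queue_fills_in_flight(events_per_object: int,
                          max_number_of_messages: int,
                          queue_events: int) -> int:
    """Return how many times the queue must fill and drain to clear
    all events from the S3 objects that are in flight at once.

    Assumes one S3 object per SQS message (an illustrative simplification).
    """
    in_flight_events = events_per_object * max_number_of_messages
    return math.ceil(in_flight_events / queue_events)


EVENTS_PER_OBJECT = 100_000  # example figure from the PR description
QUEUE_EVENTS = 4096          # ASSUMED queue.mem.events value; check your Agent version

for n in (1, 5):
    fills = queue_fills_in_flight(EVENTS_PER_OBJECT, n, QUEUE_EVENTS)
    print(f"max_number_of_messages={n}: about {fills} queue fills in flight")
```

Under these assumptions, five objects in flight share the same queue, so each object waits behind roughly five times as many events before it completes. That is consistent with the longer per-object processing time described above.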

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Related issues

[git-generate]
cd packages/aws
elastic-package changelog add --link elastic#4599 --next minor --type enhancement --description "Change default max_number_of_messages for vpcflow to 1 because VPC flow log files normally contain a high number of events."
@andrewkroh andrewkroh force-pushed the bugfix/aws/vpcflow-max-number-of-messages-default branch from 5e94bc0 to 18d1a70 November 8, 2022 22:53
@andrewkroh andrewkroh added the enhancement and Integration:aws labels Nov 8, 2022
@andrewkroh andrewkroh marked this pull request as ready for review November 8, 2022 22:56
@andrewkroh andrewkroh requested a review from a team as a code owner November 8, 2022 22:56
@elasticmachine

elasticmachine commented Nov 8, 2022

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@elasticmachine

elasticmachine commented Nov 8, 2022

💚 Build Succeeded


Build stats

  • Start Time: 2022-11-08T22:54:01.804+0000

  • Duration: 36 min 52 sec

Test stats 🧪

Test Results
Failed 0
Passed 162
Skipped 2
Total 164

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine

🌐 Coverage report

Name          Metrics % (covered/total)   Diff
Packages      100.0% (13/13)              💚
Files         92.857% (13/14)             👎 -4.806
Classes       92.857% (13/14)             👎 -4.806
Methods       84.232% (203/241)           👎 -7.18
Lines         95.697% (5204/5438)         👍 4.169
Conditionals  100.0% (0/0)                💚

@andrewkroh andrewkroh marked this pull request as draft November 23, 2022 04:49
@botelastic

botelastic bot commented Dec 23, 2022

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Dec 23, 2022
@andrewkroh
Member Author

I'm going to close this because there were some changes to the aws-s3 input in elastic/beats#33658 that may mitigate this issue. We can monitor the situation and revive this if needed.

@andrewkroh andrewkroh closed this Jan 4, 2023