aws-s3 input's bucket polling accumulates state in the registry #39116

Closed
faec opened this issue Apr 22, 2024 · 5 comments · Fixed by #41817
Labels: bug, Team:Cloud-Monitoring, Team:Elastic-Agent

Comments

faec commented Apr 22, 2024

When scanning an S3 bucket, metadata from each object is saved to the registry (including whether it has been successfully downloaded). Each object's metadata consumes approximately 1KB of space in the registry.

The intention in the code was for this metadata to be deleted after a bucket scan, but this deletion was implemented incorrectly (see also #39065), so most S3 object metadata is persisted forever and never cleaned up. This accumulates even after objects have been removed from the original bucket, or the target bucket has been changed, so that the input adds ~1GB to the registry for every million objects it has ever seen across all time and all buckets. These objects are also stored in memory during Filebeat execution and can significantly increase memory requirements on large buckets.
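The growth figures above can be turned into a rough back-of-the-envelope estimate. This is only a sketch: the `bytesPerObject` constant and the `estimateRegistryBytes` helper are hypothetical names, and the ~1KB per-object figure is the approximation quoted in this issue, not a measured value.

```go
package main

import "fmt"

// bytesPerObject is the approximate per-object registry footprint quoted in
// this issue (~1KB). It is an estimate, not a measured constant.
const bytesPerObject = 1000

// estimateRegistryBytes returns the approximate registry growth, in bytes,
// for the total number of S3 objects the input has ever listed.
// Hypothetical helper for illustration only.
func estimateRegistryBytes(objectsSeen int64) int64 {
	return objectsSeen * bytesPerObject
}

func main() {
	for _, n := range []int64{10_000, 1_000_000, 50_000_000} {
		gb := float64(estimateRegistryBytes(n)) / 1e9
		fmt.Printf("%d objects -> ~%.2f GB\n", n, gb)
	}
}
```

Because entries are never removed, the input to this estimate is the cumulative count of objects ever listed, not the current size of the bucket.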

@faec added the bug, Team:Elastic-Agent, and Team:Cloud-Monitoring labels Apr 22, 2024
@faec self-assigned this Apr 22, 2024
@elasticmachine

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz commented Nov 13, 2024

@faec did this get solved as part of the larger redesign of the input, or is this still outstanding?

Kavindu-Dodan commented Nov 14, 2024

I will look at this from a technical point of view (current implementation, potential improvements similar to what has been suggested, quick wins and follow-ups). Will update the issue with findings.

Update

We are discussing the following possibilities:

  • Registry clean-up for S3 objects that we no longer observe when listing the bucket
  • Registry clean-up based on the storage class of the object [1], along with recommending that customers use lifecycle policies to archive old S3 objects (further reducing costs) [2]

[1] - https://github.com/aws/aws-sdk-go/blob/v1.55.5/service/s3/api.go#L45823
[2] - https://aws.amazon.com/s3/pricing/
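The first option under discussion, dropping state for objects that no longer appear in the listing, could look roughly like the sketch below. It is an illustration only and not the change that was eventually merged: `objectState` and `pruneUnlisted` are hypothetical names, and the real registry types in Beats differ.

```go
package main

import "fmt"

// objectState is a hypothetical stand-in for the per-object metadata the
// input keeps in the registry; the real type in Beats differs.
type objectState struct {
	Key    string
	Stored bool // whether the object was downloaded successfully
}

// pruneUnlisted deletes every state whose key did not appear in the latest
// bucket listing and returns the number of entries removed.
func pruneUnlisted(states map[string]objectState, listed []string) int {
	seen := make(map[string]struct{}, len(listed))
	for _, k := range listed {
		seen[k] = struct{}{}
	}
	removed := 0
	for k := range states {
		if _, ok := seen[k]; !ok {
			delete(states, k) // deleting while ranging over a map is safe in Go
			removed++
		}
	}
	return removed
}

func main() {
	states := map[string]objectState{
		"logs/a.json": {Key: "logs/a.json", Stored: true},
		"logs/b.json": {Key: "logs/b.json", Stored: true},
		"logs/c.json": {Key: "logs/c.json", Stored: false},
	}
	// Pretend the latest listing no longer returns logs/a.json.
	removed := pruneUnlisted(states, []string{"logs/b.json", "logs/c.json"})
	fmt.Println("removed:", removed, "remaining:", len(states))
}
```

With this approach the registry size is bounded by the number of objects currently in the bucket, at the cost of needing a complete listing before any state can be safely removed.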

@Kavindu-Dodan self-assigned this Nov 14, 2024

zmoog commented Nov 18, 2024

> @faec did this get solved as part of the larger redesign of the input, or is this still outstanding?

This issue is specific to the S3-polling mode and (AFAIK) still applies to the latest version. The @elastic/obs-ds-hosted-services team is working on it with the help of @faec for the review.

@Kavindu-Dodan

PR #41817 was merged on 07-Jan-2025. The configuration options that avoid registry state growth, ignore_older and start_timestamp, will be delivered with Beats 8.18.0.
