[Filebeat] aws-s3 input - incorporate Last-Modified time into _id #42040

Closed
andrewkroh opened this issue Dec 14, 2024 · 1 comment · Fixed by #42078
Labels: enhancement, Filebeat, needs_team

Comments

andrewkroh commented Dec 14, 2024

Describe the enhancement:

The AWS S3 input computes a document _id from the bucket, object key, and data offset. If an object is mutated and subsequently reread, events from the new version can therefore be assigned the same _id values as events from the earlier version. Incorporating the object's “Last-Modified” time (https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html#SysMetadata) into the value should keep the IDs unique across versions. Since the “Last-Modified” time is not user-controllable, it is assumed to be unique each time an object is mutated.
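
To illustrate the idea, here is a minimal sketch in Python. It is not Filebeat's actual implementation: the function names, the SHA-256 hash, the 10-character truncation, and the zero-padded offset format are all assumptions made for this example. It only shows that adding the Last-Modified time to the hashed inputs yields a different _id when the same bucket, key, and offset are reread after a mutation.

```python
import hashlib
from datetime import datetime, timezone


def object_hash(bucket_arn: str, key: str, last_modified: datetime) -> str:
    """Hash the object's identity, including its Last-Modified time.

    Hypothetical helper: the hash function and truncation length are
    assumptions for illustration, not Filebeat's real scheme.
    """
    h = hashlib.sha256()
    stamp = last_modified.astimezone(timezone.utc).isoformat()
    for part in (bucket_arn, key, stamp):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()[:10]


def document_id(bucket_arn: str, key: str, last_modified: datetime, offset: int) -> str:
    """Combine the object hash with the data offset of the event."""
    return f"{object_hash(bucket_arn, key, last_modified)}-{offset:012d}"


# Same bucket, key, and offset, but the object was rewritten an hour later.
first_write = datetime(2024, 12, 14, 10, 0, 0, tzinfo=timezone.utc)
second_write = datetime(2024, 12, 14, 11, 0, 0, tzinfo=timezone.utc)

id_before = document_id("arn:aws:s3:::example-bucket", "logs/app.log", first_write, 0)
id_after = document_id("arn:aws:s3:::example-bucket", "logs/app.log", second_write, 0)

print(id_before != id_after)
```

Rereading an unmodified object still produces the same _id for each offset, so deduplication on the Elasticsearch side would keep working for objects that have not changed.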

Describe a specific use case for the enhancement or feature:

Better support use-cases where S3 objects are treated as mutable.

Workaround option

Add a processor that clears the _id field.

processors:
  # Clear generated document IDs because S3 objects are updated, and the IDs might be
  # the same for updated objects.
  - drop_fields:
      fields:
        - '@metadata._id'
      ignore_missing: true

botelastic bot commented Dec 14, 2024

This issue doesn't have a Team:<team> label.
