[chore][fileconsumer] Remove generation field from reader #25897

djaglowski
Member

@djaglowski djaglowski commented Aug 18, 2023

The fileconsumer package remembers all files it has seen for a short period of time. This ensures that we do not double-ingest files in the rare case where a file "temporarily disappears", as may happen with a file rotation strategy that uses a temp file name outside the include glob. In such cases, we may know of a file during one poll interval, fail to find it in the next, and then find it again in the one after that.

We manage this short-term memory by remembering files for a given number of poll intervals, currently hardcoded to 3. This PR restructures the way that we manage these generations. Previously, the reader struct had a field called generation, which was incremented when rotating generations. All previously known readers were stored in a single slice, which was appended to and cut as appropriate to rotate the generations.
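For illustration, the previous approach can be sketched roughly as follows. This is not the actual fileconsumer code: the `reader` type here is a hypothetical stand-in with only a `path` and the `generation` counter, and `rotate` and `maxGenerations` are names assumed for this example.

```go
package main

import "fmt"

// reader is a hypothetical stand-in for the real fileconsumer reader.
// It carries its own age, counted in poll intervals.
type reader struct {
	path       string
	generation int
}

// maxGenerations mirrors the hardcoded value of 3 mentioned above.
const maxGenerations = 3

// rotate ages every known reader by one generation, drops the ones that
// have aged out, and appends the readers seen in the current poll.
func rotate(known []*reader, seen []*reader) []*reader {
	kept := known[:0] // filter in place, reusing the backing array
	for _, r := range known {
		r.generation++
		if r.generation < maxGenerations {
			kept = append(kept, r)
		}
	}
	for _, r := range seen {
		r.generation = 0
		kept = append(kept, r)
	}
	return kept
}

func main() {
	known := rotate(nil, []*reader{{path: "a.log"}})
	for poll := 2; poll <= 4; poll++ {
		known = rotate(known, nil)
		fmt.Printf("poll %d: %d known reader(s)\n", poll, len(known))
	}
}
```

Note that the aging logic has to reach into every reader on every poll, which is the coupling this PR sets out to remove.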

Now, each generation is stored in a separate slice, so we can manage generations without the need to increment or check a generation field. This has a couple benefits:

  1. The management of generations is a concern that is outside the reader itself, so the reader should not have to carry this information or expose it externally as would be necessary if it were in a dedicated package.
  2. Management of generations may change in the future, but it's likely that some notion of generations will persist. Future designs can more easily slot into a []generation pattern, vs a []reader pattern.
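A minimal sketch of the `[]generation` pattern described above, under stated assumptions: `history`, `rotate`, `find`, and `maxGenerations` are hypothetical names, the `reader` is reduced to a `path`, and lookup is by path purely to keep the example short (it does not reflect how the real package matches files).

```go
package main

import "fmt"

// reader is a hypothetical stand-in; note it no longer has a generation field.
type reader struct {
	path string
}

// generation holds the readers known from a single poll interval.
type generation []*reader

// maxGenerations mirrors the hardcoded value of 3 mentioned above.
const maxGenerations = 3

// history keeps the most recent generations, oldest first.
type history struct {
	generations []generation
}

// rotate records the current poll's readers as the newest generation and
// drops the oldest generation once more than maxGenerations are held.
func (h *history) rotate(current generation) {
	h.generations = append(h.generations, current)
	if len(h.generations) > maxGenerations {
		h.generations = h.generations[1:]
	}
}

// find searches from newest to oldest and returns the matching reader,
// or nil if the file is not remembered.
func (h *history) find(path string) *reader {
	for i := len(h.generations) - 1; i >= 0; i-- {
		for _, r := range h.generations[i] {
			if r.path == path {
				return r
			}
		}
	}
	return nil
}

func main() {
	h := &history{}
	h.rotate(generation{{path: "a.log"}})
	h.rotate(nil) // the file "temporarily disappears"
	fmt.Println("still remembered:", h.find("a.log") != nil)
}
```

In this shape, aging is purely a matter of which slice a reader sits in, so the reader needs no knowledge of generations at all.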

@github-actions github-actions bot requested a review from atoulme August 18, 2023 19:21
@djaglowski djaglowski changed the title from "Pkg stanza fileconsumer extract generation" to "[chore][fileconsumer] Remove generation field from reader" Aug 22, 2023
@github-actions
Contributor

github-actions bot commented Sep 6, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Sep 6, 2023
@djaglowski djaglowski removed the Stale label Sep 6, 2023
@djaglowski
Member Author

This is failing due to race conditions in the flusher. As a result, I've taken a detour to simplify the tokenize package. (See #26241 as a potential end state.) I believe this will simplify the problem here as well.

@github-actions
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@djaglowski
Member Author

Resolved by #27396

@djaglowski djaglowski closed this Oct 4, 2023