Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

design doc/ordering constraint removal #3268

Merged
merged 8 commits into from
Feb 25, 2021

Conversation

owen-d
Copy link
Member

@owen-d owen-d commented Jan 29, 2021

This contains a proposed design doc for removing the ordering constraint in Loki.

@ukolovda
Copy link
Contributor

Should file name be 2021-01-Ordering-Constraint-Removal.md (instead of 2020)?


In order to mitigate this, there are a few options (not mutually exclusive):
1) Lower the valid acceptance range
2) Create an _active_ validity window, such as `[most_recent_sample-max_chunk_age, now() + creation_grace_period]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could someone work around this by ingesting samples in decreasing order of timestamp?
But I think this is a good enough approach for now without adding a lot of complexity.

Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the overall idea without adding a lot of complexity and without altering the performance benefits that users get who are sending ordered logs.

@owen-d
Copy link
Member Author

owen-d commented Feb 11, 2021

Addresses #1544

2) Create an _active_ validity window, such as `[most_recent_sample-max_chunk_age, now() + creation_grace_period]`.

The first option is simple, already available, and likely somewhat reasonable.
The second is simple to implement and an effective way to ensure Loki can ingest unordered logs but maintain a sliding validity window. I expect this to cover nearly all reasonable use cases and effectively mitigate bad actors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do that. If we don't restraint the boundary of a chunks, it will appear too often for queries and will lead to bad performance.

As of today the only way to have overlapping time between chunk of the same stream is replication. I believe this won't be true anymore. (Not a big problem just a statement.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I fully intend to use the second strategy here to slowly cinch the validity window. This will allow relatively out of order writes while preventing bad actors from writing samples i.e. one week apart over and over.

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I have doubt about synchronization but I guess we can tackle that once it's a problem.

@tlaukkan
Copy link

It would be really useful to have this implemented. Currently Loki is limited to use cases where log sources guarantee time stamp ordering.

@stevehipwell
Copy link
Contributor

This would be a game changer as far as allowing Loki to fit into complex log aggregation scenarios. We've had to break our fluent-bit -> fluentd -> log stores convention and push to Loki with fluent-bit as Fluentd can't keep the logs ordered without introducing significant cardinality; we still get the entry out of order error on fluent-bit but less frequently as the logs aren't aggregated.

@owen-d
Copy link
Member Author

owen-d commented Feb 25, 2021

Intending to work on this as soon as scheduling permits 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants