tail plugin sends messages out of order #1746
Comments
Can you check if this problem is specific to the |
How would I go about doing that? |
Checked with
It seems to work:
|
I don't have too much familiarity with fluentd. But my broad impression using other things such as SumoLogic is that timestamps are the organising principle of data fragments at scale, not order of creation or order of delivery. SumoLogic for example typically supports the use of either "source" timestamps or "arrival" timestamps. If there is any buffering the "source" timestamps might be more accurate, but data from multiple sources is relying on those various clocks to be synchronised, using the same timezone, etc. |
@eplodn may I ask what your fluentd config looks like? |
@eplodn also what do you see if you use a 1MB Mem_Buf_Limit ? In my measurements if fluentbit sends more data than what fluentd can write out, there will be issues just like you reported. |
Order is not guaranteed when multiple chunks of data are generated (each chunk holds N records): if an output has to retry a chunk at flush/delivery time, that chunk arrives later than the chunks flushed after it. Forcing an order would slow things down and add backpressure. |
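To make the retry effect concrete, here is a minimal simulation in Python. It is only a sketch of the behaviour described in the comment above, not Fluent Bit code: the `deliver` function, the chunk size of 3, and the "fails once, retried after everything else" policy are all assumptions chosen for illustration.

```python
def deliver(chunks, failing_once):
    """Simulate flushing chunks to an output (hypothetical model).

    Chunks whose index is in `failing_once` fail on their first attempt
    and are retried only after the remaining chunks have been flushed.
    """
    received = []
    retry_queue = []
    for idx, chunk in enumerate(chunks):
        if idx in failing_once:
            retry_queue.append(chunk)  # first attempt fails, retry later
        else:
            received.extend(chunk)     # delivered on the first attempt
    for chunk in retry_queue:
        received.extend(chunk)         # retried chunks arrive last
    return received


records = list(range(1, 10))                       # lines 1..9 as written
chunks = [records[i:i + 3] for i in range(0, len(records), 3)]
arrived = deliver(chunks, failing_once={1})        # second chunk must retry
print(arrived)  # [1, 2, 3, 7, 8, 9, 4, 5, 6]
```

Every record arrives, but the retried chunk lands after the chunk that was created later, which matches the symptom in the report: nothing is lost, yet the order differs from the order of writing.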
@edsiper is this similar to flush threads in fluentd? can this behaviour be controlled? |
With normal fluentd you can use the buffer chunk key configuration to buffer logs by timestamp (and/or by source) before sending, so you can get some level of ordering depending on how long you wait for logs: https://docs.fluentd.org/configuration/buffer-section#time I'm currently investigating how to deal with this also (as seen in my most recent commit in grafana/loki#898 (comment)) |
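The idea of buffering by time before sending can be sketched as follows. This is a Python illustration only, loosely modelled on the `timekey` and `timekey_wait` parameters of the fluentd buffer section; the `TimeBucketBuffer` class is hypothetical, and sorting inside each bucket is an assumption of this sketch rather than documented fluentd behaviour.

```python
from collections import defaultdict


class TimeBucketBuffer:
    """Group records into fixed time windows and release a window only
    after an extra waiting period, so late records can still join it."""

    def __init__(self, timekey, wait):
        self.timekey = timekey          # window length in seconds
        self.wait = wait                # grace period for late records
        self.buckets = defaultdict(list)

    def add(self, ts, line):
        # Key each record by the start of the window its timestamp falls in.
        self.buckets[ts - ts % self.timekey].append((ts, line))

    def flush_ready(self, now):
        # A window is ready once its end plus the grace period has passed.
        ready = sorted(start for start in self.buckets
                       if start + self.timekey + self.wait <= now)
        out = []
        for start in ready:
            # Assumption of this sketch: sort by timestamp inside the window.
            out.extend(line for _, line in sorted(self.buckets.pop(start)))
        return out


buf = TimeBucketBuffer(timekey=10, wait=5)
for ts, line in [(3, "c"), (1, "a"), (12, "d"), (2, "b")]:
    buf.add(ts, line)

early = buf.flush_ready(now=14)   # window [0, 10) is still within its wait
first = buf.flush_ready(now=15)   # window [0, 10) is released, sorted
later = buf.flush_ready(now=25)   # window [10, 20) is released
```

The trade-off is the one mentioned in the comment: the longer the wait, the more late records can be slotted into place, at the cost of added delivery latency.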
@edsiper there are situations where the order is more important than throughput. Is there a flag to control this (disable concurrent chunk processing) for any given output? |
There is no such control at the moment, but there is a plan to implement it (Loki needs it). |
@edsiper is this now implemented with the latest version? |
@edsiper Any update on this issue? And is it possible to let the output plugin control the order even if input messages are out of order? |
I also ran into this immediately while testing out Loki with FB. AFAICT, all data in the chunk containing out-of-order events are lost when using Loki. |
Fluent Bit does not send records in order; we optimize for performance rather than for serialized messages. |
Note: the multiplex option is only for the Loki connector, which needs that feature, but it slows things down. |
Any update on this, or #3254? Does anyone have a workaround (other than switching to promtail or fluentd)? |
yeah please update on the same, looks like switching to promtail is the only option left, we use fluentBit -> fluentD -> loki -> S3 (Ceph) |
@edsiper I was wondering: to avoid breaking changes or compromises in performance, maybe it would be a good idea to add sequence numbers to chunks of data in certain plugins, where it makes sense. For instance, the tail plugin reads data sequentially, so why not attach a sequence number to each line (I think? maybe to each chunk, but probably to each line), so that the order can be restored on the receiving side, at the cost of performance, if the receiver so chooses? This would also allow one to verify that all data arrived safely. |
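A receiver-side sketch of this proposal, in Python, might look like the following. Nothing here exists in Fluent Bit: the `Resequencer` class, the per-line sequence numbers starting at 1, and the `missing` helper are all hypothetical, used only to show how order could be restored and gaps detected.

```python
import heapq


class Resequencer:
    """Hold out-of-order lines until the next expected sequence number
    arrives, then release them in order (hypothetical receiver logic)."""

    def __init__(self, first_seq=1):
        self.next_seq = first_seq
        self.pending = []               # min-heap of (seq, line)

    def push(self, seq, line):
        if seq >= self.next_seq:        # ignore lines already released
            heapq.heappush(self.pending, (seq, line))
        released = []
        while self.pending and self.pending[0][0] <= self.next_seq:
            s, held_line = heapq.heappop(self.pending)
            if s == self.next_seq:      # anything smaller is a duplicate
                released.append(held_line)
                self.next_seq += 1
        return released

    def missing(self):
        """Sequence numbers still awaited below the highest buffered one."""
        held = {seq for seq, _ in self.pending}
        if not held:
            return []
        return [s for s in range(self.next_seq, max(held)) if s not in held]


rs = Resequencer()
ordered = []
gaps_seen = []
for seq, line in [(1, "a"), (3, "c"), (4, "d"), (2, "b")]:
    ordered.extend(rs.push(seq, line))
    gaps_seen.append(rs.missing())
```

In this run `ordered` ends up as `["a", "b", "c", "d"]`, and `missing()` reports sequence number 2 while lines 3 and 4 are being held. The cost is visible too: lines 3 and 4 are delayed until line 2 shows up, which is the performance trade-off the comment mentions.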
It looks like for
|
Hmm, well, it's merged in |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the |
This issue was closed because it has been stalled for 5 days with no activity. |
Bug Report
Describe the bug
We use fluentbit to forward logs to a centralized location. When writing 100,000 lines, all of them arrive, but the order is messed up (apparently at the sender side?).
To Reproduce
On sender:
On the receiving machine:
Expected behavior
I expected the lines to arrive in the same order as they were written.
Your Environment
Log:
Additional context
People looking at log files have an expectation that these will arrive in order.