Filebeat truncates log messages after 16k #6605
+1
We encounter this with some messages as well. Can it be fixed?
I initially tried to reproduce this without Docker to see if it is something in Filebeat itself or more likely related to Docker. As I didn't manage to reproduce it with local files, I started to investigate a bit deeper into Docker. It seems this breaking change was introduced with 1.13 and is kind of a known issue: moby/moby#34855 With moby/moby#34888 it was made possible for log drivers to make the buffer size configurable, but if no buffer size is set it falls back to the 16k we see above. As Filebeat reads the files from disk, I assume the lines are already split up in the log file which Docker created. I need to further investigate whether the above is correct and whether there is a config option in the json-file driver to change the buffer size.
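The splitting described above can be simulated without Docker. The sketch below is illustrative only (the names and the timestamp are made up, not Docker's actual code): it writes a message the way the json-file driver does, flushing at most 16k at a time, one JSON object per line.

```python
import io
import json

BUF_SIZE = 16 * 1024  # the 16k buffer mentioned in moby/moby#34855

def write_json_log(message, stream):
    """Write `message` like the json-file driver: split into chunks of
    at most BUF_SIZE bytes, each serialized as its own JSON line."""
    for i in range(0, len(message), BUF_SIZE):
        chunk = message[i:i + BUF_SIZE]
        record = {"log": chunk, "stream": "stdout",
                  "time": "2018-03-20T00:00:00Z"}  # placeholder timestamp
        stream.write(json.dumps(record) + "\n")

buf = io.StringIO()
write_json_log("a" * 20000 + "\n", buf)
lines = buf.getvalue().splitlines()
# A ~20k message spans two JSON lines, so a line-by-line reader such as
# Filebeat sees two separate events.
print(len(lines))  # 2
```

Joining the `log` fields of consecutive lines reconstructs the original message, which is what makes recombination on the Filebeat side possible at all.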
I did a test with the json-file logger, and indeed, in the file itself the log message is split up across two lines, which means Filebeat cannot do much else than read the file line by line.
I've noticed docker format doesn't add a
@exekias |
@f0 That is interesting. But as this is internal, it is up to the specific logger to make use of it. Do you know if the json-file logger has something similar?
@ruflin I checked with 17.09.1-ce; it seems that the json-logger does not have this flag. I can confirm that the log ends with \r\n
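The trailing-newline observation above suggests a detection heuristic: in the json-file format, a complete message's `log` field ends with `\n` (or `\r\n`), while a chunk produced by the 16k split does not. A minimal Python sketch of that idea (illustrative, not Filebeat's actual Go implementation):

```python
import json

def is_partial(json_line):
    """Heuristic from the discussion: a decoded "log" field without a
    trailing newline is a partial chunk of a longer message.
    endswith("\n") also covers messages terminated with "\r\n"."""
    return not json.loads(json_line)["log"].endswith("\n")

full = '{"log":"short message\\n","stream":"stdout"}'
part = '{"log":"first 16k of a long message","stream":"stdout"}'
print(is_partial(full))  # False
print(is_partial(part))  # True
```

This is the kind of check a reader can apply per line, without any cooperation from the Docker daemon.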
So does a ticket have to be opened w/ moby for this?
@ruflin @f0 What seems more interesting is this work to add partial message metadata. Might be better to base the Filebeat work off this: moby/moby#35831
@wallies Thanks for the link. I only skimmed through the PR, but it seems that in the case of partial data the json logger would skip the newline and put the second object on the same line. It would mean we have 2 (or more) JSON objects on the same line. Not sure yet how we would deal with it on our end, but I'm sure we would find a way. From a Filebeat perspective I would prefer if the json logger had a configurable buffer size, so we would not have to glue things together. I assume a PR could be made against the json-file logger to support this, but I don't know whether that would be an option from the moby project's perspective. Does anyone know more here?
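If the json logger really did write several concatenated JSON objects onto one line, as speculated above, a reader could still split them apart with an incremental decoder. A hypothetical Python sketch of that approach (not Filebeat or Docker code):

```python
import json

def decode_concatenated(line):
    """Split a line containing one or more concatenated JSON objects,
    using raw_decode to find where each object ends."""
    decoder = json.JSONDecoder()
    objects, pos = [], 0
    line = line.strip()
    while pos < len(line):
        obj, end = decoder.raw_decode(line, pos)
        objects.append(obj)
        pos = end
        while pos < len(line) and line[pos].isspace():
            pos += 1  # skip any whitespace between objects
    return objects

line = '{"log":"first part"}{"log":"second part\\n"}'
print([o["log"] for o in decode_concatenated(line)])
# ['first part', 'second part\n']
```

`raw_decode` returns the decoded object plus the index where it ended, which is exactly what is needed to walk a line of back-to-back objects.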
@bitsofinfo It is probably worth opening an issue about buffer support in the json-file logger to get a feeling for whether a PR in this direction would have a chance in the moby project.
@ruflin Would it have more weight if someone from Elastic opened such an issue? I'm not sure which moby project or area of code to reference and point them to. I'm also concerned that any moby-side solution will take months, if not more than a year, to actually get implemented :/ Can nothing be done in Beats to handle this?
As long as there is no flag or indication that the same event is spread across multiple lines, there is not much Beats can do. I think the quickest way to get a change in is to open a PR and get it merged, as I doubt someone would open a PR based on the GitHub issue alone if there is no real urgency for it. Whether the PR is opened by a Beats team member or an external contributor should not make a difference; it would probably be best if it were opened by someone who has made a few changes to moby/docker in the past. Right now Filebeat fully depends on the json-file logger. There are other outputs, like journald, which may be supported by Filebeat in the future; at least we are investigating it. This is also a long shot, but perhaps some of these outputs already have these features built in. @bitsofinfo The biggest pushback I expect from the moby contributors is asking why someone would have 16kb in a log line (based on the comments in the thread). The good news is that some PRs to make this possible were already accepted, so this might be an obsolete concern; but the changes never made it to the default logger so far.
Added an issue at moby/moby#36777, please chime in there.
Is there any way to fix this in Filebeat without waiting for movement in moby/moby? Even if they add mechanisms that Filebeat can hook into to puzzle partial messages back together, it's not a given that a new Docker release can be rolled out everywhere. Kubernetes usually states a supported Docker engine version that is a few releases behind, not everybody can freely choose their Docker version, and so on.
I think we can fix this by checking the string ending; I'll do some tests.
+1, the changes will likely never happen in moby, or at least not within a reasonable timeframe.
I've opened #6967 to handle partial messages when using the
@exekias I'd love to help validate this. Can you throw a snapshot into a registry somewhere?
I've pushed a build of my branch to Settings should look like this (take
@exekias I've tested your build and it works like magic. I hit the limit in ES, where an indexed field cannot contain more than 32k characters, but when I split the message into multiple fields, I could reach 300k with no problems. Thanks a lot for the quick action.
Docker's `json-file` driver splits lines longer than 16k bytes; this change adapts the code to detect that situation and join them again to output a single event. This behavior is enabled by default and can be disabled with the new `combine_partials` flag. Fixes elastic#6605
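The recombination behavior the PR description names can be sketched in a few lines. This is a minimal illustration of the idea only, not Filebeat's Go implementation: buffer chunks whose `log` field lacks a trailing newline, and emit one event once a terminating chunk arrives.

```python
import json

def combine_partials(json_lines):
    """Join consecutive json-file chunks into whole log messages,
    using the trailing newline in "log" as the completion marker."""
    pending = ""
    for line in json_lines:
        chunk = json.loads(line)["log"]
        pending += chunk
        if chunk.endswith("\n"):
            yield pending
            pending = ""
    if pending:
        yield pending  # flush an unterminated trailing message

lines = [
    '{"log":"part one... "}',
    '{"log":"part two\\n"}',
    '{"log":"a normal line\\n"}',
]
print(list(combine_partials(lines)))
# ['part one... part two\n', 'a normal line\n']
```

The same logic works regardless of how many 16k chunks a message was split into, since only the final chunk carries the newline.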
Hi, apologies for hijacking the issue, but based on the above I am not quite sure what the solution was. I am seeing this issue: https://discuss.elastic.co/t/filebeat-logstash-output-split-messages-into-multiple-with-approximately-a-maximum-field-size-of-8191-charactes/330174/2 I use hints-based Filebeat to collect stdout logs from Docker, to Logstash, to ES. At the moment it is splitting the messages, and I thought this was a problem with Filebeat. However, it looks like a solution was found as part of this issue to send the messages together just by changing the Filebeat configuration? Can someone confirm whether this is in place, in which version, and how to tell Filebeat to collate messages that are supposed to be together?
How to reproduce:
use my config, start a container and run the following:
This produces valid JSON output. For me it cuts off at "somerand" and a new message continues with "omkey_396".
We also tried the following and it does not fix the issue (as well as with and without the `harvester_buffer_size` option).

@paltryeffort https://discuss.elastic.co/t/filebeat-splits-message-after-16k/123718/4