
Join Docker log lines when they are split because of size #6967

Merged 1 commit, May 8, 2018

Conversation

@exekias exekias commented Apr 27, 2018

Docker's json-file driver splits lines longer than 16k bytes; this
change adapts the code to detect that situation and join the parts again to
output a single event.

This behavior is enabled by default; it can be disabled with the new
combine_partials flag.

Fixes #6605

return message, err
}
// Handle multiline messages, join lines that don't end with \n
for message.Content[len(message.Content)-1] != byte('\n') {
Should we enable this first through a config option and have it as default later on? This would allow us to start playing around with it.

@exekias I'm glad we have the docker prospector ;-)


That's a great idea, I'll add it

exekias commented Apr 27, 2018

@ruflin I added the partial setting; after some time we should switch it to true by default, as this should be its default behavior.

===== `containers.partial`

Enable joining of partial messages. Docker splits log lines larger than 16k bytes; enable this
to join them while reading (default: false).
I think it would be useful to specify that it works by joining lines that don't end with \n.

@exekias exekias Apr 30, 2018
Uhm, that's something specific to the json-file format; it's not the user who adds the \n. Do you think it's worth it? If so, I would explain the whole story of how the format works.

I thought this whole input was specific to json-file log driver?

I do think it's useful for the user to understand what's happening, if for nothing else than to avoid questions on the subject. I would add the info if it's simple to explain, but if the explanation is complicated I wouldn't put it in the docs, because it could confuse users.

ruflin commented Apr 30, 2018

Code LGTM. For the config, I was thinking of putting it outside containers and directly at the prospector level. Otherwise it gives the impression it's a container-specific setting, but I don't think it's really related to the containers themselves.

Naming-wise, I was wondering if we could find a name that points at what it does instead of what problem it solves. So instead of partial, something more like combine_partial_messages, which is too long but hopefully describes what I mean. Any suggestions?

exekias commented Apr 30, 2018

👍 How about combine_partials? I'm not too worried, as this will be a rather exotic flag (once we set it to true by default).

ruflin commented Apr 30, 2018

👍 on combine_partials. I just realised that the docker input is still experimental, so we could already turn it on by default and see what happens. Still good to have a config option so that, in case things go wrong, we can turn it off.

exekias commented May 7, 2018

I've renamed the flag to combine_partials, set it to true by default, and updated the docs.
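
Putting the thread's decisions together (flag renamed to `combine_partials`, moved to the prospector level, default `true`), a docker prospector config would presumably look something like the sketch below; the exact layout should be checked against the 6.4 reference file:

```yaml
filebeat.prospectors:
- type: docker
  containers.ids:
    - '*'
  # Join log lines split by the json-file driver (lines >16k bytes).
  # Enabled by default; set to false to receive the raw split parts.
  combine_partials: true
```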

@ruflin ruflin left a comment
LGTM. Could you add the config entry to the reference file?

ruflin commented May 8, 2018

@exekias Could you rebase and squash your PR when updating the config file and update the commit and PR message?

@exekias exekias force-pushed the docker-multiline branch from b96357f to cfc8798 on May 8, 2018 10:04
Docker `json-file` driver splits lines longer than 16k bytes, this
change adapts the code to detect that situation and join them again to
output a single event.

This behavior is enabled by default, it can be disabled by using the new
`combine_partials` flag.

Fixes elastic#6605
exekias commented May 8, 2018

done!

@exekias exekias force-pushed the docker-multiline branch from cfc8798 to 6909ce1 on May 8, 2018 10:11
@kvch kvch merged commit 1789ef9 into elastic:master May 8, 2018
@bitsofinfo

nice!

stevea78 pushed a commit to stevea78/beats that referenced this pull request May 20, 2018
stevea78 pushed a commit to stevea78/beats that referenced this pull request May 20, 2018
@SharpEdgeMarshall

When should this be available? Version 6.4?

exekias commented May 29, 2018

That's correct, it should be available by 6.4

towolf commented Jun 13, 2018

@exekias This was not released with 6.3.0 today right?

@bitsofinfo

?! hope it was.... really need this

@bitsofinfo

What's the ETA on 6.4?

exekias commented Jun 14, 2018

I'm afraid this didn't make it into 6.3.0; it should be available with 6.4.0. We can consider a backport to 6.3.1 if we consider this a bugfix. Any opinion here, @ruflin?

ruflin commented Jun 15, 2018

As Carlos mentioned, we normally don't backport features, and I don't think we can qualify this as a bug. The biggest worry with backporting is that it could break something that currently works, which can always happen. I would personally prefer not to backport, but to make sure it is as stable as possible in 6.4.

towolf commented Jun 15, 2018

So the cadence is roughly quarterly and we can expect 6.4 in 3 months or so?

ruflin commented Jun 15, 2018

We don't give estimates on release dates ;-) If you want to test or use the feature, we have some snapshots available: https://beats-package-snapshots.s3.amazonaws.com/index.html?prefix=filebeat/

@bitsofinfo

so 3 more months?

@buddyledungarees

Is it possible to use the existing multiline settings to replicate this functionality?

multiline.pattern: '\n$'
multiline.negate: true
multiline.match: after

wallies commented Jul 6, 2018

We are running a fork with this fix, but we are noticing that the for loop doesn't catch EOF and will sometimes exit halfway through processing a long line. This means the current partial log line will not have been completely processed. Later, when the next part of the now-half-processed log line arrives in the tailed log file, filebeat will fail to parse it. Specifically, the following check will fail:

if strings.HasPrefix(string(message.Content), "{") {

Below is the parse error we see:

ERROR	log/harvester.go:243	Read line error: parsing CRI timestamp: parsing time ",\\\"protocol\\\":\\\" as "2006-01-02T15:04:05Z07:00": cannot parse

We have added a retry when io.EOF is reached for now, but I'm wondering whether the backoff code I've seen in other parts of the codebase reaches this part of the code, or whether it should.

exekias commented Jul 6, 2018

Hi @wallies, thank you for your feedback, and thank you for testing unreleased features! Could you please open a new issue for this (or a pull request 😉)? Happy to discuss it there and try to fix it for 6.4.

towolf commented Aug 23, 2018

@ruflin it's been 32 months now. Can't you just push this out?

ruflin commented Aug 23, 2018

@towolf It will be released with the rest of the stack in 6.4 hopefully pretty soon.

9 participants