Trim large messages #2644
Conversation
Build FAILURE
Force-pushed from 11ab942 to 0efdf2c
Build SUCCESS
This pull request introduces 1 alert when merging 0efdf2c3313ae1bad02dbe90a5d5fe160d92ecb6 into b1620af - view on LGTM.com.
New alerts:
Comment posted by LGTM.com
Force-pushed from 0efdf2c to 6c40bc9
Build FAILURE
Force-pushed from 6c40bc9 to 9c4aefc
Build SUCCESS
Could you cover with a unit test the case when a small message and a large message are both available in the buffer, and the large message is not truncated below?
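A rough sketch of the scenario such a test could exercise, using a standalone toy extractor rather than the project's real test harness (all names are illustrative, not syslog-ng API):

```c
/* Toy model of the requested case: a small and an oversized frame sit in the
 * same buffer; the oversized one is trimmed, the small one is untouched.
 * Illustrative only -- not the syslog-ng test harness or API. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one "<len> <payload>" frame at *pos, copy at most max_msg bytes of
 * the payload into out, and consume the whole frame regardless of trimming. */
static size_t
extract_trimmed(const char *buf, size_t len, size_t *pos, char *out, size_t max_msg)
{
  size_t frame_len = 0, i = *pos;

  while (i < len && buf[i] >= '0' && buf[i] <= '9')
    frame_len = frame_len * 10 + (buf[i++] - '0');
  assert(i < len && buf[i] == ' ');
  i++;

  size_t copy = frame_len < max_msg ? frame_len : max_msg;
  memcpy(out, buf + i, copy);
  out[copy] = '\0';
  *pos = i + frame_len;     /* skip the trimmed tail as well */
  return copy;
}

int
main(void)
{
  const char *input = "5 small20 aaaaaaaaaaaaaaaaaaaa";  /* small + large frame */
  size_t pos = 0;
  char msg[64];

  extract_trimmed(input, strlen(input), &pos, msg, 10);
  assert(strcmp(msg, "small") == 0);      /* small message passes through */

  extract_trimmed(input, strlen(input), &pos, msg, 10);
  assert(strlen(msg) == 10);              /* large message trimmed to max_msg */
  assert(pos == strlen(input));           /* remainder of the frame consumed */

  puts("ok");
  return 0;
}
```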
Force-pushed from 9c4aefc to cce53c6
Thanks!
Build FAILURE
@kira-syslogng retest this please
Build SUCCESS
Note to your ToDo: aux (or ancillary) data is a transport functionality (credentials, peer address, pid, etc.), while trim is a proto functionality.
- required: check "Refactor trim incoming messages" szemere/syslog-ng#6
Note: I'm not sure the state machine couldn't still be minimized.
@szemere: and one more note on the other topic (the tagging issue): maybe it is time to modify the fetch method: what if we create a
additionally corrected some nearby indentations Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
The framed server has never supported the encoding option, thus init_buffer_size == max_buffer_size == max_msg_size Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
As a preparation to make log_proto_framed_server_fetch states more independent from each other (will be easier to introduce a new state) eliminated the try_read variable with some code duplication. Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
Implementing the "trim_large_messages" option. (Currently aligning with the logic of log_proto_framed_server_fetch, with lots of goto statements and code duplication.) Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
in log_proto_framed_server_fetch into enums and continue statements Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
Force-pushed from cce53c6 to 8d45b47
Build SUCCESS
On the caller side, we are only interested in the case when we were able to fetch any data from the input. Before this change, EAGAIN was masked with LPS_SUCCESS, giving mixed results on the caller side. Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
With the previous change of log_proto_framed_server_fetch_data, it was easier to create "read" states in log_proto_framed_server_fetch. By separating the "read" and "extract" steps, the code duplication caused by the "extract -> read -> extract" logic was eliminated. The extract state will jump into read if there is not enough data to finish. The read state will return LPS_SUCCESS and continue later if there is nothing to read. Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
From now on, frame header reading will ensure that there is enough space in the buffer for reading the frame header (with a low chance of an unsuccessful parsing attempt). And with the knowledge of the actual frame_len, it will also make sure that there will be enough room for the message later. Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
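For readers following along, here is a compact, self-contained sketch of the read/extract split these commits describe; the state names and helpers are made up for illustration and do not match the real LogProtoFramedServer code:

```c
/* Sketch of the "extract falls back to read, read resumes later" control flow
 * described above. Names (State, Proto, fetch, ...) are illustrative and do
 * not match the real LogProtoFramedServer implementation. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { EXTRACT_HEADER, READ_HEADER, EXTRACT_MSG, READ_MSG } State;

typedef struct
{
  const char *input;        /* stands in for the transport */
  size_t input_pos;
  char buffer[16];          /* deliberately small proto buffer */
  size_t buffer_end;
  size_t frame_len;
  State state;
} Proto;

/* Pull whatever the "transport" has into the buffer; false means EAGAIN. */
static bool
read_data(Proto *self)
{
  size_t avail = strlen(self->input) - self->input_pos;
  size_t room = sizeof(self->buffer) - self->buffer_end;
  size_t n = avail < room ? avail : room;

  if (n == 0)
    return false;
  memcpy(self->buffer + self->buffer_end, self->input + self->input_pos, n);
  self->buffer_end += n;
  self->input_pos += n;
  return true;
}

/* Fetch one "<len> <payload>" frame; false means "no complete message yet",
 * and because the state is stored, the next call resumes where we stopped. */
static bool
fetch(Proto *self, char *msg)
{
  while (true)
    {
      switch (self->state)
        {
        case EXTRACT_HEADER:
          {
            char *space = memchr(self->buffer, ' ', self->buffer_end);
            if (!space)
              { self->state = READ_HEADER; continue; }
            self->frame_len = strtoul(self->buffer, NULL, 10);
            size_t used = (size_t) (space - self->buffer) + 1;
            self->buffer_end -= used;
            memmove(self->buffer, self->buffer + used, self->buffer_end);
            self->state = EXTRACT_MSG;
            continue;
          }
        case EXTRACT_MSG:
          if (self->buffer_end < self->frame_len)
            { self->state = READ_MSG; continue; }
          memcpy(msg, self->buffer, self->frame_len);
          msg[self->frame_len] = '\0';
          self->buffer_end -= self->frame_len;
          memmove(self->buffer, self->buffer + self->frame_len, self->buffer_end);
          self->state = EXTRACT_HEADER;
          return true;
        case READ_HEADER:
        case READ_MSG:
          if (!read_data(self))
            return false;   /* nothing to read now; resume from here later */
          self->state = (self->state == READ_HEADER) ? EXTRACT_HEADER : EXTRACT_MSG;
          continue;
        }
    }
}

int
main(void)
{
  Proto p = { .input = "5 hello6 world!", .state = EXTRACT_HEADER };
  char msg[16];

  while (fetch(&p, msg))
    puts(msg);
  return 0;
}
```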
Force-pushed from 8d45b47 to 245bf0f
Build SUCCESS
Build FAILURE
I'm dismissing my review because I'm feeling sick and don't want to block the PR. Thanks for addressing my notes!
Force-pushed from 0594915 to d9bb667
Build FAILURE
@kira-syslogng retest this please
Build SUCCESS
from the state machine in log_proto_framed_server_fetch Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
split into smaller methods, each state got a separate method Signed-off-by: Laszlo Budai <stentor.bgyk@gmail.com>
Signed-off-by: Laszlo Budai <stentor.bgyk@gmail.com>
Signed-off-by: Laszlo Budai <stentor.bgyk@gmail.com>
to prevent the starvation of other sources. Signed-off-by: Laszlo Szemere <laszlo.szemere@balabit.com>
Force-pushed from d9bb667 to 4cd0e8f
@@ -99,6 +101,9 @@ log_proto_framed_server_fetch_data(LogProtoFramedServer *self, gboolean *may_read)
   if (!(*may_read))
     return FALSE;

+  if (self->fetch_counter++ >= MAX_FETCH_COUNT)
+    return FALSE;
I think this is not necessary.
I just read about this yesterday: every edge-triggered server works like this, it has to read until EAGAIN is returned, and server programs usually do not deal with the possibility of starvation at this level.
In theory, starvation is possible with super-fast senders, but in my opinion this is not a real-world scenario.
The same question was asked on the nginx forum a few years ago:
https://forum.nginx.org/read.php?29,250351,250360#msg-250360
When an in-memory loopback interface is used, it might be possible to reproduce a starvation problem, but with real network connections or files, reading from a memory buffer is orders of magnitude faster than those devices, so the problem does not really exist (unless the CPU is under very high load).
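For reference, the edge-triggered pattern referred to here looks roughly like this (a generic sketch, not syslog-ng code); the fetch_counter cap from the diff above would simply bound how many iterations of such a loop one wakeup may spend on a single connection:

```c
/* Generic edge-triggered drain loop: keep reading until the kernel reports
 * EAGAIN, because with EPOLLET we will not be notified again for data that
 * is already sitting in the socket buffer. Sketch only, not syslog-ng code. */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_ITERATIONS 20   /* optional cap, analogous to MAX_FETCH_COUNT */

static void
drain_fd(int fd, void (*process)(const char *data, ssize_t len))
{
  char buf[4096];
  int iterations = 0;

  while (iterations++ < MAX_ITERATIONS)
    {
      ssize_t n = read(fd, buf, sizeof(buf));

      if (n > 0)
        {
          process(buf, n);
          continue;             /* more data may still be buffered */
        }
      if (n == 0)
        break;                  /* peer closed the connection */
      if (errno == EAGAIN || errno == EWOULDBLOCK)
        break;                  /* drained: wait for the next edge */
      if (errno == EINTR)
        continue;
      break;                    /* real error; caller should close the fd */
    }
}
```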
However, if we really want to care about this case, our other proto server implementations would have to be adjusted as well.
Fortunately, epoll uses round robin, so the concept will actually work:
"If more than maxevents file descriptors are ready when epoll_wait() is called, then successive epoll_wait() calls will round robin through the set of ready file descriptors. This behavior helps avoid starvation scenarios, where a process fails to notice that additional file descriptors are ready because it focuses on a set of file descriptors that are already known to be ready."
Note: We have log-msg-size and log-iw-size; together with our flow-control mechanism they can avoid starvation even if flags(flow-control) is not set (but this is on a different level).
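A minimal sketch of the behaviour the man page excerpt above describes: with a deliberately small maxevents, successive epoll_wait() calls rotate through the ready descriptors instead of always returning the same ones (generic example, not syslog-ng code):

```c
/* Minimal event loop illustrating the epoll_wait() round-robin behaviour:
 * maxevents is kept small, so when more descriptors are ready than fit in
 * one call, the remaining ones are returned by the following calls.
 * Generic sketch, not syslog-ng code. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>

#define MAX_EVENTS 8   /* deliberately smaller than the number of connections */

void
event_loop(int epfd)
{
  struct epoll_event events[MAX_EVENTS];

  for (;;)
    {
      int n = epoll_wait(epfd, events, MAX_EVENTS, -1);

      if (n < 0)
        {
          perror("epoll_wait");
          exit(EXIT_FAILURE);
        }
      for (int i = 0; i < n; i++)
        {
          /* here each ready fd would be drained until EAGAIN,
           * as in the drain_fd() sketch earlier */
          printf("ready: fd %d\n", events[i].data.fd);
        }
      /* If more than MAX_EVENTS fds were ready, the next epoll_wait() call
       * picks up the ones we did not get this time (round robin). */
    }
}
```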
MAX_FETCH_COUNT adds back the old try_read logic in a much cleaner way. Removing this patch did not really change our performance numbers, so I'm approving the PR.
Build SUCCESS
@kira-syslogng test this please test branch=pzolee-trim-large-messages;
Build SUCCESS
Add the trim-large-messages option to the logproto-framed-server (syslog source driver).

Without trimming, the framed server simply drops oversized messages (> log_msg_size) and closes the incoming connection.

With trim-large-messages enabled, the framed server creates a log message from the first (log_msg_size sized) part of the message and ignores the rest of it. The communication continues uninterrupted with the following message.

TODO: the fact of the trimming could be passed to the LogReader via LogTransportAuxData, and could be marked on the new log message, i.e. with a tag. But I don't like the idea that LogReader has to deal with a property from one specific protocol.
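To illustrate the difference in behaviour, here is a sketch of the extract-time decision under assumed names (the real logic lives in the framed server's fetch path and is more involved):

```c
/* Sketch of the extract-time decision described above. "Trim" keeps the first
 * max_msg_size bytes and silently skips the rest of the frame; without it the
 * only option is to give up on the input. Names are illustrative, not the
 * actual syslog-ng symbols. */
#include <stddef.h>
#include <string.h>

typedef enum { EXTRACT_SUCCESS, EXTRACT_ERROR } ExtractResult;

ExtractResult
extract_message(const char *frame, size_t frame_len,
                char *msg, size_t *msg_len,
                size_t max_msg_size, int trim_large_messages)
{
  if (frame_len <= max_msg_size)
    {
      memcpy(msg, frame, frame_len);
      *msg_len = frame_len;
      return EXTRACT_SUCCESS;
    }
  if (trim_large_messages)
    {
      /* keep the first max_msg_size bytes, drop the tail, and keep the
       * connection open so the next frame can be processed normally */
      memcpy(msg, frame, max_msg_size);
      *msg_len = max_msg_size;
      return EXTRACT_SUCCESS;
    }
  /* old behaviour: oversized frame => drop the message and close the input */
  return EXTRACT_ERROR;
}
```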