Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor trim incoming messages #6

Conversation

lbudai
Copy link

@lbudai lbudai commented Apr 4, 2019

note: it is possible to register callbacks for each state(before that we should unify the formal parameter list of the callbacks), but I'd stop here.

The refactored code is not fully equivalent to the original and for reason, namely, dependency between case branches (fallthrough) is eliminated. This kind of dependency makes it impossible to keep functions small. If you want to mimic the original behaviour, just call the state-handler directly after the state is changed (and return with the result of that call).

@lbudai
Copy link
Author

lbudai commented Apr 4, 2019

framed-proto-server

@szemere szemere force-pushed the trim_incoming_messages branch from cce53c6 to 8d45b47 Compare April 5, 2019 07:37
Copy link

@MrAnno MrAnno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for the last 3 commits (I validated them comparing everything to cce53c6).
I'm reviewing everything else in the main PR.

@szemere szemere force-pushed the trim_incoming_messages branch 2 times, most recently from 0594915 to d9bb667 Compare April 8, 2019 08:03
@szemere
Copy link
Owner

szemere commented Apr 8, 2019

Thank you for the changes! Unfortunately, I was unable to resolve the conflicts on GitHub, so I cherry-picked the commits manually.

@szemere szemere closed this Apr 8, 2019
szemere pushed a commit that referenced this pull request Sep 3, 2020
Parameterized tests are the worst in Criterion. Specially with pointers and ASLR.

Causes test to fail when executed with ASAN:
```
==3181004==ERROR: AddressSanitizer: SEGV on unknown address 0x7faeee7938a8 (pc 0x7f0f053880e7 bp 0x7faeee7938a0 sp 0x7ffe7fe59510 T0)
==3181004==The signal is caused by a READ memory access.
    #0 0x7f0f053880e7 in PyObject_Call (/usr/lib/libpython3.8.so.1.0+0x1420e7)
    #1 0x4f7f7c in python_ack_tracker_test_ack_tracker_factory_impl /home/kokan/src/syslog-ng/build/../modules/python/tests/test_python_ack_tracker.c:124:42
    #2 0x7f0f05225e3e in criterion_internal_test_main (/usr/lib/libcriterion.so.3+0x8e3e)
    #3 0x4f7e34 in python_ack_tracker_test_ack_tracker_factory_jmp /home/kokan/src/syslog-ng/build/../modules/python/tests/test_python_ack_tracker.c:119:1
    #4 0x7f0f05225308  (/usr/lib/libcriterion.so.3+0x8308)
    #5 0x7f0f05236b4c  (/usr/lib/libcriterion.so.3+0x19b4c)
    #6 0x7f0f04626151 in __libc_start_main (/usr/lib/libc.so.6+0x28151)
    #7 0x41f43d in _start (/home/kokan/src/syslog-ng/build/modules/python/tests/Debug/test_python_ack_tracker+0x41f43d)
```

Signed-off-by: Kokan <kokaipeter@gmail.com>
szemere pushed a commit that referenced this pull request Dec 9, 2020
Every time a key created a counter names is allocated. But it is not
freed. The only time it is freed when a new stats-cluster is created
from it (first time looked up) and the counter_names owenership
transfared to the stats-cluster.

With this patch the allocation is delayed to the ownership change, thus
only allocated if there is still a pointer to the memory.

Reproduction:
1. syslog-ng configuration with at least one network source
2. start syslog-ng with a leak detector
(2.1 optional do restart (in order to have the exact leak as above))
3. stop syslog-ng

=================================================================
==2095056==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x55a4f86db781 in calloc (/tmp/install/sbin/syslog-ng+0xc5781)
    #1 0x7f343f03a941 in g_malloc0 (/usr/lib/libglib-2.0.so.0+0x50941)
    #2 0x7f343acb37c7 in _make_connection_conter_stats_queryable /home/user/src/syslog-ng/build/../modules/afsocket/afsocket-source.c:1007:9
    #3 0x7f343acb30f3 in afsocket_sd_init_method /home/user/src/syslog-ng/build/../modules/afsocket/afsocket-source.c:1060:5
    #4 0x7f343acbc358 in afinet_sd_init /home/user/src/syslog-ng/build/../modules/afsocket/afinet-source.c:103:8
    #5 0x7f343f248e1a in log_pipe_init /home/user/src/syslog-ng/build/../lib/logpipe.h:301:23
    #6 0x7f343f248c88 in cfg_tree_start /home/user/src/syslog-ng/build/../lib/cfg-tree.c:1419:12
    #7 0x7f343f23af71 in cfg_init /home/user/src/syslog-ng/build/../lib/cfg.c:368:8
    #8 0x7f343f270b3c in main_loop_reload_config_apply /home/user/src/syslog-ng/build/../lib/mainloop.c:276:41
    #9 0x7f343f273162 in _consume_action /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:240:3
    #10 0x7f343f27321f in _invoke_sync_call_actions /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:250:7
    #11 0x7f343f2736ba in main_loop_worker_sync_call /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:396:7
    syslog-ng#12 0x7f343f2709d1 in main_loop_reload_config_commence /home/user/src/syslog-ng/build/../lib/mainloop.c:340:3
    syslog-ng#13 0x7f343f275852 in control_connection_reload /home/user/src/syslog-ng/build/../lib/mainloop-control.c:167:3
    syslog-ng#14 0x7f343f290c80 in control_connection_io_input /home/user/src/syslog-ng/build/../lib/control/control-server.c:378:3
    syslog-ng#15 0x7f343efd4b2c  (/usr/lib/libivykis.so.0+0x6b2c)
    syslog-ng#16 0x7f343f239c50 in run_application_hook /home/user/src/syslog-ng/build/../lib/apphook.c:125:11

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x55a4f86db781 in calloc (/tmp/install/sbin/syslog-ng+0xc5781)
    #1 0x7f343f03a941 in g_malloc0 (/usr/lib/libglib-2.0.so.0+0x50941)
    #2 0x7f343acb3e27 in _stop_connection_counter_stats_queryable /home/user/src/syslog-ng/build/../modules/afsocket/afsocket-source.c:1027:9
    #3 0x7f343acb38bd in afsocket_sd_deinit_method /home/user/src/syslog-ng/build/../modules/afsocket/afsocket-source.c:1073:3
    #4 0x7f343f2491d4 in log_pipe_deinit /home/user/src/syslog-ng/build/../lib/logpipe.h:316:25
    #5 0x7f343f2490e5 in cfg_tree_stop /home/user/src/syslog-ng/build/../lib/cfg-tree.c:1439:12
    #6 0x7f343f23b2d1 in cfg_deinit /home/user/src/syslog-ng/build/../lib/cfg.c:387:10
    #7 0x7f343f271ec2 in main_loop_exit_finish /home/user/src/syslog-ng/build/../lib/mainloop.c:427:3
    #8 0x7f343f273162 in _consume_action /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:240:3
    #9 0x7f343f27321f in _invoke_sync_call_actions /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:250:7
    #9 0x7f343f27321f in _invoke_sync_call_actions /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:250:7
    #10 0x7f343f2736ba in main_loop_worker_sync_call /home/user/src/syslog-ng/build/../lib/mainloop-worker.c:396:7
    #11 0x7f343f2709d1 in main_loop_reload_config_commence /home/user/src/syslog-ng/build/../lib/mainloop.c:340:3
    syslog-ng#12 0x7f343f275852 in control_connection_reload /home/user/src/syslog-ng/build/../lib/mainloop-control.c:167:3
    syslog-ng#13 0x7f343f290c80 in control_connection_io_input /home/user/src/syslog-ng/build/../lib/control/control-server.c:378:3
    syslog-ng#14 0x7f343efd4b2c  (/usr/lib/libivykis.so.0+0x6b2c)
    syslog-ng#15 0x7f343f239c50 in run_application_hook /home/user/src/syslog-ng/build/../lib/apphook.c:125:11

Signed-off-by: Kokan <kokaipeter@gmail.com>
szemere pushed a commit that referenced this pull request Jan 15, 2021
This commit accumulates synthetic messages in an array, and forwards them
after releasing the GroupingBy parser's lock.

It is NOT allowed to call log_pipe_forward_msg() when locks are held,
because it would slow down the entire pipeline, and it could also result
in deadlock, for example:

1. A new message is produced in syslog-ng's worker pool, which goes
   through the parser, so it acquires GroupingBy::lock, and starts
   processing the message (continues in step 3).

2. The main thread schedules grouping_by_timer_tick(), which needs to
   hold GroupingBy::lock, so it blocks and waits for the other thread.

3. A new synthetic message is emitted in the worker thread (step 1).
   stateful_parser_emit_synthetic() is called in
   grouping_by_expire_entry(), which is protected by GroupingBy::lock,
   The message goes through the pipeline with this lock held, reaches a
   destination, where main_loop_call(wait=TRUE) is called.

Worker waits for the main loop call to be scheduled and finished,
main thread waits for GroupingBy::lock, which will never be released.

A deadlock situation requires at least 2 locks:
- one was GroupingBy::lock;
- the other was main_task_lock and its GCond in main_loop_call().

Main thread:
```
#0  syscall ()
#1  g_mutex_lock_slowpath ()
#2  _grouping_by_timer_tick ()
#3  grouping_by_timer_tick ()
#4  iv_run_timers ()
#5  iv_main ()
#6  main_loop_run ()
#7  main ()
```

Worker thread:
```
#0  syscall ()
#1  g_cond_wait ()
#2  main_loop_call ()
#3  ml_batched_timer_postpone ()
#4  log_writer_queue ()
#5  log_dest_driver_queue_method ()
#6  log_multiplexer_queue ()
#7  log_multiplexer_queue ()
#8  log_filter_pipe_queue ()
#9  stateful_parser_emit_synthetic ()
#10 grouping_by_expire_entry ()
#11 timer_wheel_set_time ()
syslog-ng#12 grouping_by_set_time ()
syslog-ng#13 grouping_by_process ()
...
```

Important note:
The unlock trick wouldn't have worked in the timer expire callback case,
as we would be iterating through a non-locked data structure (TimerWheel)
that could be modified by other threads.

This commit is based on a00164c, which is
a similar deadlock fix for PatternDB.

Signed-off-by: László Várady <laszlo.varady@protonmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants