Multi worker processing #1386

tagomoris · 2016-12-22T11:06:28Z

This feature request is to implement "symmetric multi worker processing", which runs a specified number of Fluentd worker processes so that 2 or more CPU cores can be used with just one configuration file.

All plugins used in such a configuration MUST support the multi worker feature, and MUST declare it via the #multi_workers_ready? method (which returns true or false after #configure).
By default, all input and output plugins return false, and all other plugins return true. 3rd party plugins SHOULD declare whether they support multi worker processing or not.
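
The defaults described above can be sketched in a few lines of Ruby. This is only an illustration: the `Sketch*` classes and the `unready_plugins` helper are hypothetical stand-ins, not Fluentd's real plugin classes, and the method name `multi_workers_ready?` follows the out_file diff reviewed later in this thread.

```ruby
# Stand-in base class; NOT Fluentd's real plugin base class.
class SketchPlugin
  def configure(conf)
    @conf = conf
  end

  # Filters, parsers, formatters and so on: ready by default.
  def multi_workers_ready?
    true
  end
end

# Inputs and outputs opt out by default.
class SketchInput < SketchPlugin
  def multi_workers_ready?
    false
  end
end

class SketchOutput < SketchPlugin
  def multi_workers_ready?
    false
  end
end

# A plugin that is safe under multiple workers declares so explicitly.
class SketchStdoutOutput < SketchOutput
  def multi_workers_ready?
    true
  end
end

# Hypothetical check that could run after #configure:
# returns the class names of plugins that block multi worker mode.
def unready_plugins(plugins)
  plugins.reject(&:multi_workers_ready?).map { |plugin| plugin.class.name }
end
```

With this shape, a configuration using `workers 3` could be rejected whenever `unready_plugins` returns a non-empty list.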

Points:

- buffer paths / local plugin storages are automatically configured to use a "workerN" directory under the server root directory (or the directory specified by "path")
- worker_id=0 takes care of buffer chunk files under the directory used by a single worker process configuration
- when users decrease the number of workers, they should take care of the buffer chunk files themselves
  - moving buffer chunk files between directories works well (workers 4 -> 3, mv worker3/* worker2)
- the logger prints the worker id in logs from each worker process
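
As a rough sketch of the manual step in the list above (workers 4 -> 3, mv worker3/* worker2), the following Ruby moves leftover files from removed workers into the last remaining worker's directory. The helper names `worker_dir` and `merge_worker_dirs` are hypothetical, and the sketch assumes the files sit directly under each "workerN" directory and that file names do not collide.

```ruby
require "fileutils"

# Hypothetical helper: per-worker directory under a root directory.
def worker_dir(root_dir, worker_id)
  File.join(root_dir, "worker#{worker_id}")
end

# Move everything from the directories of removed workers into the
# directory of the last remaining worker. Returns the moved file names.
def merge_worker_dirs(root_dir, old_workers, new_workers)
  target = worker_dir(root_dir, new_workers - 1)
  FileUtils.mkdir_p(target)
  moved = []
  (new_workers...old_workers).each do |id|
    Dir.glob(File.join(worker_dir(root_dir, id), "*")).sort.each do |file|
      FileUtils.mv(file, target)
      moved << File.basename(file)
    end
  end
  moved
end
```

Calling `merge_worker_dirs(root, 4, 3)` corresponds to the `mv worker3/* worker2` example.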

tagomoris · 2016-12-22T11:08:04Z

The workers are now working well:

```
$ bundle exec bin/fluentd -c example/in_forward_workers.conf
2016-12-22 20:02:15 +0900 [info]: reading config file path="example/in_forward_workers.conf"
2016-12-22 20:02:15 +0900 [info]: starting fluentd-0.14.10 pid=21028
2016-12-22 20:02:15 +0900 [info]: spawn command to main: cmdline=["/Users/tagomoris/.rbenv/versions/2.3.1/bin/ruby", "-Eascii-8bit:ascii-8bit", "-rbundler/setup", "bin/fluentd", "-c", "example/in_forward_workers.conf", "--under-supervisor"]
2016-12-22 20:02:16 +0900 [info]: reading config file path="example/in_forward_workers.conf"
2016-12-22 20:02:16 +0900 [info]: starting fluentd-0.14.10 without supervision pid=21059
2016-12-22 20:02:16 +0900 [info]: gem 'fluentd' version '0.14.10'
2016-12-22 20:02:16 +0900 [info]: adding match pattern="test" type="stdout"
2016-12-22 20:02:16 +0900 [info]: reading config file path="example/in_forward_workers.conf"
2016-12-22 20:02:16 +0900 [info]: starting fluentd-0.14.10 without supervision pid=21058
2016-12-22 20:02:16 +0900 [info]: gem 'fluentd' version '0.14.10'
2016-12-22 20:02:16 +0900 [info]: adding match pattern="test" type="stdout"
2016-12-22 20:02:16 +0900 [info]: adding source type="forward"
2016-12-22 20:02:16 +0900 [info]: using configuration file: workers 3 @type forward @type stdout worker_id_key "worker_id"
2016-12-22 20:02:16 +0900 [info]: starting fluentd worker pid=21059 ppid=21028 worker=2
2016-12-22 20:02:16 +0900 [info]: fluentd worker is now running
2016-12-22 20:02:16 +0900 [info]: reading config file path="example/in_forward_workers.conf"
2016-12-22 20:02:16 +0900 [info]: adding source type="forward"
2016-12-22 20:02:16 +0900 [info]: starting fluentd-0.14.10 without supervision pid=21057
2016-12-22 20:02:16 +0900 [info]: gem 'fluentd' version '0.14.10'
2016-12-22 20:02:16 +0900 [info]: adding match pattern="test" type="stdout"
2016-12-22 20:02:16 +0900 [info]: using configuration file: workers 3 @type forward @type stdout worker_id_key "worker_id"
2016-12-22 20:02:16 +0900 [info]: starting fluentd worker pid=21058 ppid=21028 worker=1
2016-12-22 20:02:16 +0900 [info]: fluentd worker is now running
2016-12-22 20:02:16 +0900 [info]: adding source type="forward"
2016-12-22 20:02:16 +0900 [info]: using configuration file: workers 3 @type forward @type stdout worker_id_key "worker_id"
2016-12-22 20:02:16 +0900 [info]: starting fluentd worker pid=21057 ppid=21028 worker=0
2016-12-22 20:02:16 +0900 [info]: fluentd worker is now running
2016-12-22 20:02:20.210008000 +0900 test: {"message":"yaaaaaaaaaaaaaaaaaaaaaaaaay","worker_id":2}
2016-12-22 20:02:21.137974000 +0900 test: {"message":"yaaaaaaaaaaaaaaaaaaaaaaaaay","worker_id":0}
2016-12-22 20:02:21.972813000 +0900 test: {"message":"yaaaaaaaaaaaaaaaaaaaaaaaaay","worker_id":0}
2016-12-22 20:02:22.808682000 +0900 test: {"message":"yaaaaaaaaaaaaaaaaaaaaaaaaay","worker_id":1}
2016-12-22 20:02:23.612803000 +0900 test: {"message":"yaaaaaaaaaaaaaaaaaaaaaaaaay","worker_id":1}
2016-12-22 20:02:29.273108000 +0900 test: {"message":"yaaaaaaaaaaaaaaaaaaaaaaaaay","worker_id":2}
```

tagomoris · 2016-12-27T02:58:20Z

Current status:

```
$ bundle exec bin/fluentd -c example/in_forward_workers.conf
2016-12-27 11:55:07 +0900 [info]: reading config file path="example/in_forward_workers.conf"
2016-12-27 11:55:07 +0900 [info]: starting fluentd-0.14.11 pid=8877
2016-12-27 11:55:07 +0900 [info]: spawn command to main:  cmdline=["/Users/tagomoris/.rbenv/versions/2.4.0/bin/ruby", "-Eascii-8bit:ascii-8bit", "-rbundler/setup", "bin/fluentd", "-c", "example/in_forward_workers.conf", "--under-supervisor"]
2016-12-27 11:55:08 +0900 [info]: gem 'fluentd' version '0.14.11'
2016-12-27 11:55:08 +0900 [info]: adding match pattern="test" type="stdout"
2016-12-27 11:55:08 +0900 [info]: #2 starting fluentd worker pid=8908 ppid=8877 worker=2
2016-12-27 11:55:08 +0900 [info]: #2 [forward_in_1] listening a tcp port port=24224 bind="0.0.0.0"
2016-12-27 11:55:08 +0900 [info]: #2 fluentd worker is now running worker=2
2016-12-27 11:55:08 +0900 [info]: adding source type="forward"
2016-12-27 11:55:08 +0900 [info]: using configuration file: <ROOT>
  <system>
    workers 3
    root_dir "/Users/tagomoris/github/fluentd/test/tmp/root"
  </system>
  <source>
    @type forward
    @id forward_in_1
  </source>
  <match test>
    @type stdout
    @id stdout_out_1
    <inject>
      worker_id_key "worker_id"
    </inject>
    <buffer>
      @type "file"
      flush_interval 1s
    </buffer>
  </match>
</ROOT>
2016-12-27 11:55:08 +0900 [info]: #0 starting fluentd worker pid=8906 ppid=8877 worker=0
2016-12-27 11:55:08 +0900 [info]: #0 [forward_in_1] listening a tcp port port=24224 bind="0.0.0.0"
2016-12-27 11:55:08 +0900 [info]: #0 fluentd worker is now running worker=0
2016-12-27 11:55:08 +0900 [info]: #1 starting fluentd worker pid=8907 ppid=8877 worker=1
2016-12-27 11:55:08 +0900 [info]: #1 [forward_in_1] listening a tcp port port=24224 bind="0.0.0.0"
2016-12-27 11:55:08 +0900 [info]: #1 fluentd worker is now running worker=1
2016-12-27 11:55:21.513142000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":1}
2016-12-27 11:55:22.239894000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":0}
2016-12-27 11:55:22.606160000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":2}
2016-12-27 11:55:23.564816000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":2}
2016-12-27 11:55:23.950919000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":0}
2016-12-27 11:55:24.484824000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":0}
2016-12-27 11:55:23.159434000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":1}
2016-12-27 11:55:24.180292000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":1}
2016-12-27 11:55:24.792380000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":2}
2016-12-27 11:55:25.168374000 +0900 test: {"message":"yaaaaaaaaaaaaay","worker_id":0}
^C2016-12-27 11:56:04 +0900 [info]: Received graceful stop
2016-12-27 11:56:05 +0900 [info]: #0 shutting down fluentd worker worker=0
2016-12-27 11:56:05 +0900 [info]: #2 shutting down fluentd worker worker=2
2016-12-27 11:56:05 +0900 [info]: #1 shutting down fluentd worker worker=1
2016-12-27 11:56:06 +0900 [info]: Worker 2 finished with status 0
2016-12-27 11:56:06 +0900 [info]: Worker 1 finished with status 0
2016-12-27 11:56:06 +0900 [info]: Worker 0 finished with status 0
```

repeatedly · 2016-12-27T06:45:50Z

Previously, we considered the following syntax:

```
<workers 3>
  <source>
    @type forward
  </source>
  <match app.**>
    @type s3
  </match>
</workers>
<source>
  @type dstat
</source>
<match stat.**>
  @type mackerel
</match>
```

Has this been dropped?

tagomoris · 2016-12-27T06:49:20Z

@repeatedly Yes, in my current opinion. It's not "symmetric".
I can understand the workloads behind such a configuration, but it's difficult to implement in a reasonable way.

tagomoris · 2016-12-27T06:58:20Z

I just came up with an idea to enable a configuration section only in a specified worker process:

```
<system>
  workers 3
</system>
<source>
  # ...
</source>
<match data.**>
  # ...
</match>

<worker 0>
  <source>
    @type dstat
  </source>
  <match stat.**>
    @type mackerel
  </match>
</worker>
```

It's much easier to implement, and it's also easy to understand which worker runs the configuration specified by the <worker> section.

repeatedly · 2016-12-27T07:01:46Z

That configuration is acceptable to me.

tagomoris · 2016-12-27T07:03:07Z

OK, I'll create another issue for that feature request. It's too large to implement in this feature branch, and I think it's acceptable to implement it in a future version.

tagomoris · 2016-12-27T08:02:57Z

@repeatedly could you review this feature?

tagomoris · 2017-01-04T02:02:43Z

@repeatedly ping

repeatedly · 2017-01-04T05:59:19Z

lib/fluent/log.rb

+      worker_id_part = if type == :default && (@process_type == :worker0 || @process_type == :workers)
+                         @worker_id_part
+                       else
+                         ""


repeatedly · 2017-01-04T06:02:34Z

lib/fluent/log.rb

+        end
+
+        if plugin_id_configured?
+          @log.optional_header = "[#{@id}] "


Is #{self.class.name} not needed?

Including the plugin class name looks too verbose to me.
It's not useful for end users, and it's clear from the filename when -v is specified.

repeatedly · 2017-01-04T06:25:47Z

lib/fluent/plugin/buf_file.rb

@@ -67,33 +73,42 @@ def configure(conf)

        @@buffer_paths[@path] = type_of_owner

-        if File.exist?(@path)
-          if File.directory?(@path)
+        if File.exist?(@path) && File.directory?(@path) || !File.exist?(@path) && !@path.include?('.*') # directory


Do we need both File.exist?(@path) and !File.exist?(@path)?

Yes, because this clause must NOT match the case of File.exist?(@path) && !File.directory?(@path).

If so, please use () to separate the conditions.
A && || && combination on one line is hard to read and maintain.
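
One way to make the condition easier to read is sketched below. It is not the code that was actually committed: the helper name `buffer_path_is_directory?` and the local variable names are hypothetical, and the logic simply mirrors the one-line condition from the diff, with each half given a name.

```ruby
# Hypothetical helper mirroring the condition in the diff above.
# A path is treated as a directory when it already exists as one, or when
# it does not exist yet and contains no '.*' placeholder.
def buffer_path_is_directory?(path)
  existing_directory = File.exist?(path) && File.directory?(path)
  new_directory_path = !File.exist?(path) && !path.include?(".*")
  existing_directory || new_directory_path
end
```

An existing regular file makes both halves false, which is the case the clause must not match.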

repeatedly · 2017-01-04T07:00:48Z

lib/fluent/plugin/out_file.rb

+    end
+
+    def multi_workers_ready?
+      ### TODO: add hack to synchronize for multi workers


Do we need more work to improve multi workers support in out_file?

Ah, leaving this comment in was just a mistake.

repeatedly · 2017-01-04T07:05:46Z

lib/fluent/supervisor.rb

      show_plugin_config if @show_plugin_config
      read_config
      set_system_config

+      if @workers < 1
+        raise Fluent::ConfigError, "invalid number of workers:#{@workers}"


"Invalid number of workers. Must be > 0" would be clearer for users.
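
A minimal sketch of the suggested wording follows. It raises `ArgumentError` so that it runs without Fluentd installed, whereas the diff raises `Fluent::ConfigError`. The method name `validate_workers!` and the exact message text are assumptions for illustration, not the merged code.

```ruby
# Hypothetical validation helper. The message states the valid range
# explicitly so users can see how to fix the configuration.
def validate_workers!(workers)
  if workers < 1
    raise ArgumentError, "invalid number of workers (must be > 0): #{workers}"
  end
  workers
end
```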

repeatedly · 2017-01-04T07:39:52Z

lib/fluent/log.rb

+                       else
+                         ""
+                       end
+      log_msg = "#{time.strftime(@time_format)}[#{LEVEL_TEXT[level]}]: #{worker_id_part}"


How about inserting worker_id_part before the ":"?
That would make it clear that this is part of the system header, not the message body.

Changing that may break user-side integrations that match the log level part.
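
The compatibility concern can be illustrated with a small sketch. Both formatting helpers and the `level_pattern` regular expression are hypothetical. The pattern stands for the kind of matcher a user-side integration might rely on, and is not taken from any real tool.

```ruby
# Worker id placed after the colon, as in the diff above.
def header_after_colon(time_str, level, worker_id)
  "#{time_str} [#{level}]: ##{worker_id} "
end

# Worker id placed before the colon, as proposed in the review.
def header_before_colon(time_str, level, worker_id)
  "#{time_str} [#{level}] ##{worker_id}: "
end

# Hypothetical user-side pattern that expects "[level]: " right after the time.
def level_pattern
  /\[(\w+)\]: /
end
```

A line built with `header_after_colon` still matches `level_pattern`, while one built with `header_before_colon` does not, because the worker id now sits between the closing bracket and the colon.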

repeatedly · 2017-01-04T07:43:27Z

OT: We should add a notice about append to the out_file article.

tagomoris · 2017-01-04T08:28:41Z

I've added some commits for review comments.

repeatedly · 2017-01-05T01:23:14Z

when users decrease the number of workers, they should take care about buffer chunk files

That is a manual operation, right?

tagomoris · 2017-01-05T01:35:53Z

Exactly.

tagomoris · 2017-01-05T02:20:36Z

Updated/added some commits based on current master HEAD.

tagomoris · 2017-01-06T09:03:11Z

I'm pushing some other pull requests for loggers and out_forward, so conflicts will need to be resolved.
@repeatedly But anyway, are there any more comments? Can I merge this after that?

repeatedly · 2017-01-09T22:50:14Z

LGTM.
Once this is merged, users can test this feature with actual plugins.

tagomoris added enhancement Feature request or improve operations v0.14 work-in-progress labels Dec 22, 2016

tagomoris self-assigned this Dec 22, 2016

tagomoris force-pushed the multi-worker-processing branch from de1001a to 2b37783 Compare December 26, 2016 06:20

tagomoris mentioned this pull request Dec 27, 2016

<worker> section to enable a set of configuration in a specified worker #1392

Closed

tagomoris removed the work-in-progress label Dec 27, 2016

tagomoris requested a review from repeatedly December 27, 2016 08:01

tagomoris force-pushed the multi-worker-processing branch from e765bfe to 08db55d Compare December 27, 2016 08:02

tagomoris force-pushed the multi-worker-processing branch from 08db55d to 5314b7a Compare December 28, 2016 06:41

tagomoris mentioned this pull request Jan 4, 2017

Rolling restart workers #1397

Open

repeatedly reviewed Jan 4, 2017

tagomoris force-pushed the multi-worker-processing branch from 9d0367e to 2cf7086 Compare January 5, 2017 02:13

add "workers" configuration parameter in system config

674fb9c

tagomoris added 23 commits January 10, 2017 13:18

add a feature to inject worker id to let users know about it

fa0242c

enable "workers"

3b3bd9d

mark multi-workers-ready plugins

5f989b5

fix to show worker id when operating worker processes

9277271

worker ids are shown as integer in logs

7436df7

add "log type" and "process type" to control whether logs should be s…

b71cd9e

…hown or not. In multi workers status, the same log messages will be shown 2 or more times without any controls. With this patch, logger will show logs about process-wide control just once.

it is too much to show all before_shutdown/close messages about all p…

d170177

…lugins on all workers

reduce # of log messages in loop/timers

aa65ca2

fix in_forward to work with multi workers

15c51a2

refactor out_file write operation for further fix (worker synchroniza…

2a17ed9

…tion)

fix out_file implementation to lock path during #write operation for …

d3264bb

…multi workers

fix bug not to show "shutting down" log message

f825845

add tests to execute 2 workers

3314d7e

remove the features #detach_process and #detach_multi_process

fcd8cc1

remove warnings

ab7e59e

freeze non-modified string

97c535a

make error message more descriptive

abaccf9

make code readable

44c06af

remove useless comment

8b98086

fix to read updated parameter values via command line options

9b7170a

update tests to use command line options correctly

21fb0e9

log level should be configured via params (updated params via command…

1e026a4

… line params)

enable shared socket only when multi workers configured

ec6a29a

tagomoris force-pushed the multi-worker-processing branch from 6e09a17 to ec6a29a Compare January 10, 2017 04:26

tagomoris added 3 commits January 10, 2017 15:58

remove invalid code which are duplicated through rebase

c25d14d

remove unused variable

615d3a1

fix to warn about logging configuration only on worker0, and fix test…

7eb3177

…s for logging formats

tagomoris merged commit 19ecdd0 into master Jan 10, 2017

tagomoris deleted the multi-worker-processing branch January 10, 2017 17:15

ashie mentioned this pull request Dec 24, 2021

root_dir/@id parameter documentation is not empirically correct #3552

Closed
