In the `MessageBuffer`, detect if we've just been called by a fork child. #199

remeh · 2021-07-13T15:47:17Z

This is needed to clear the MessageBuffer in that case, because it can contain data from the parent process.


ivoanjo

Left a few notes!

One thing that occurred to me is that another possible approach would be to move the fork checking to the Sender implementation. Doing so would allow a trivial optimization -- the default Sender can avoid all of this checking, because naturally its inner thread dies on fork, and so, while that thread lives, we're sure that no fork happened.

So only the SingleThreadSender would need to have the fork-checking behavior.

Also -- probably useful to add some tests and a microbenchmark to validate that the cost of calling Process.pid is reasonable :)
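A rough sketch of such a microbenchmark, using only Ruby's standard `benchmark` library. The iteration count and the helper name `pid_cost_ns` are assumptions made for illustration; the measured figure includes the overhead of the loop itself, so it should be read as an upper bound rather than an exact cost.

```ruby
require 'benchmark'

# Hypothetical iteration count; tune it for the machine under test.
ITERATIONS = 100_000

# Time `iterations` calls to Process.pid and return the mean cost
# per call in nanoseconds (loop overhead included).
def pid_cost_ns(iterations = ITERATIONS)
  elapsed = Benchmark.realtime do
    iterations.times { Process.pid }
  end
  elapsed / iterations * 1e9
end

puts format('Process.pid: ~%.1f ns per call', pid_cost_ns)
```

Running it on the Ruby versions and platforms the library supports would show whether a per-call pid check is cheap enough for the hot path.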

lib/datadog/statsd/forwarder.rb

lib/datadog/statsd/message_buffer.rb

remeh · 2021-09-01T09:30:46Z

> One thing that occurred to me is that another possible approach would be to move the fork checking to the Sender implementation. Doing so would allow a trivial optimization -- the default Sender can avoid all of this checking, because naturally its inner thread dies on fork, and so, while that thread lives, we're sure that no fork happened.

@ivoanjo Please correct me if I'm wrong, but I don't think that's right: in both modes (with or without the companion thread), we have to detect that we are now running under a different PID, because that means we are in a new process (this can happen if a user forks directly in their code or within a framework). Because of that fork, we have to clear the buffer of its existing data in the new process, to avoid having both processes send the same data that was in the buffer at the moment the fork happened. Does that make sense?
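A minimal sketch of the pid-based check described here. `ForkAwareBuffer` and its methods are hypothetical stand-ins invented for illustration; this is not the library's `MessageBuffer`, only the general idea of remembering the owning pid and dropping inherited data when it changes.

```ruby
# Hypothetical stand-in for a message buffer (not the library's class).
class ForkAwareBuffer
  def initialize
    @messages = []
    @owner_pid = Process.pid
  end

  # Queue a message, first dropping anything inherited from a parent process.
  def add(message)
    reset_if_forked
    @messages << message
  end

  # Return the pending messages and start over with an empty buffer.
  def flush
    reset_if_forked
    flushed = @messages
    @messages = []
    flushed
  end

  private

  # A pid different from the one recorded at creation means we are in a
  # forked child: the buffered data belongs to the parent, so discard it.
  def reset_if_forked
    return if Process.pid == @owner_pid

    @messages = []
    @owner_pid = Process.pid
  end
end
```

With this shape, a message added in the parent before a fork is flushed only by the parent; the child starts from an empty buffer on its first `add` or `flush`.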

ivoanjo · 2021-09-01T10:39:41Z

> One thing that occurred to me is that another possible approach would be to move the fork checking to the Sender implementation. Doing so would allow a trivial optimization -- the default Sender can avoid all of this checking, because naturally its inner thread dies on fork, and so, while that thread lives, we're sure that no fork happened.

> @ivoanjo Please correct me if I'm wrong, but I don't think that's right: in both modes (with or without the companion thread), we have to detect that we are now running under a different PID, because that means we are in a new process (this can happen if a user forks directly in their code or within a framework).

Yup!

> Because of that fork, we have to clear the buffer of its existing data in the new process, to avoid having both processes send the same data that was in the buffer at the moment the fork happened. Does that make sense?

Not necessarily -- which was what prompted my suggestion (but it was more of a "may be interesting" and not a "I really think this needs to change" suggestion). In the current approach in this PR, the fork detection has been added to the MessageBuffer, and then every time we try to interact with it, we check the pid.

What I was saying is that, in the Sender class, there's another way we could detect a fork -- by checking whether the background thread has died. This works because threads do not survive forks:

[1] pry(main)> background_thread = Thread.new { sleep }
=> #<Thread:0x00007fc41a84c430 (pry):1 sleep>
[2] pry(main)> fork do
[2] pry(main)*   puts "background thread status: #{background_thread.inspect}"
[2] pry(main)* end
background thread status: #<Thread:0x00007fc41a84c430 (pry):1 dead>
=> 22780
[3] pry(main)> background_thread.inspect
=> "#<Thread:0x00007fc41a84c430 (pry):1 sleep>"
[4] pry(main)>

So an alternative in the regular Sender to checking the pid every time is to check if the thread is still alive. If it is, you're sure that a fork did not occur. If it is not, then it's possible that the checking process is a child process in a fork; so an option here is, rather than cleaning the MessageBuffer, just create a new MessageBuffer + background thread combo entirely.

But, as I said above, in the SingleThreadSender this is not an option -- that one would still need the pid check.
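The thread-liveness idea could look roughly like the sketch below. `ThreadCheckingSender` and all of its methods are hypothetical names chosen for illustration, not the library's Sender API; the sketch assumes a plain `Queue` plus one worker thread, and replaces both whenever the worker is found dead.

```ruby
# Hypothetical sender sketch (not the library's Sender class).
class ThreadCheckingSender
  # The block receives each message on the worker thread.
  def initialize(&deliver)
    @deliver = deliver
    start
  end

  def add(message)
    # Only the forking thread survives a fork, so a live worker proves no
    # fork has happened since it started. A dead worker may mean we are in
    # a child process: rebuild the queue and the thread from scratch.
    start unless @worker&.alive?
    @queue << message
  end

  def stop
    @queue << :close
    @worker.join
  end

  private

  # Create a fresh queue (dropping anything inherited) and a new worker.
  def start
    @queue = Queue.new
    queue = @queue # the worker keeps its own reference to this queue
    @worker = Thread.new do
      while (message = queue.pop) != :close
        @deliver.call(message)
      end
    end
  end
end
```

One trade-off of this approach: a worker that died for any other reason (for example an exception in the delivery block) is treated the same way, so pending messages are dropped in that case too.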

remeh · 2021-09-01T11:28:49Z

Got it, thanks for the detailed explanation. Indeed, performance-wise it may be interesting to do that in the original Sender class instead! This is where benchmarks will also be useful 👍 I'll try that.

remeh · 2021-09-03T12:31:03Z

Replaced by #203.

In the MessageBuffer, detect if we've just been called by a fork child. (360f44b)

remeh force-pushed the remeh/fork-detect branch from 7ffba49 to 360f44b Compare July 13, 2021 15:56

ivoanjo reviewed Jul 14, 2021


ivoanjo mentioned this pull request Aug 13, 2021

Update README.md #202

Merged

forwarder: as they are counters, we can still emit telemetry per thread.

2410c17

message_buffer: add a #reset method.

bb14cea

remeh mentioned this pull request Sep 3, 2021

[sender] reset buffers on forks and reset the companion thread if dead or nil #203

Closed

remeh closed this Sep 3, 2021

remeh deleted the remeh/fork-detect branch September 3, 2021 12:31
