This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Complement image: propagate SIGTERM to all workers #13914

Merged
merged 3 commits into from
Sep 26, 2022

1 change: 1 addition & 0 deletions changelog.d/13914.misc
@@ -0,0 +1 @@
+Complement image: propagate SIGTERM to all workers.
31 changes: 29 additions & 2 deletions synapse/app/complement_fork_starter.py
@@ -51,11 +51,18 @@
 import importlib
 import itertools
 import multiprocessing
+import os
+import signal
 import sys
-from typing import Any, Callable, List
+from types import FrameType
+from typing import Any, Callable, List, Optional

 from twisted.internet.main import installReactor

+# a list of the original signal handlers, before we installed our custom ones.
+# We restore these in our child processes.
+_original_signal_handlers: dict[int, Callable] = {}
+

 class ProxiedReactor:
     """
@@ -105,6 +112,11 @@ def _worker_entrypoint(

     sys.argv = args

+    # reset the custom signal handlers that we installed, so that the children start
+    # from a clean slate.
+    for sig, handler in _original_signal_handlers.items():
+        signal.signal(sig, handler)
+
     from twisted.internet.epollreactor import EPollReactor

     proxy_reactor._install_real_reactor(EPollReactor())
@@ -167,13 +179,28 @@ def main() -> None:
     update_proc.join()
     print("===== PREPARED DATABASE =====", file=sys.stderr)

+    processes: List[multiprocessing.Process] = []
+
+    # Install signal handlers to propagate signals to all our children, so that they
+    # shut down cleanly. This also inhibits our own exit, but that's good: we want to
+    # wait until the children have exited.
Comment on lines +184 to +186

Contributor

I don't see why our own exit is blocked waiting for our children to exit. Does os.kill() block until the child handles the signal? Or do we already get this for free thanks to process.join() further below?

Member Author

Two halves to this:

  • Installing a custom signal handler inhibits the default handler, which for SIGINT and SIGTERM causes the receiving process to exit immediately.
  • The process.join() calls mean we will exit once the children have exited.

os.kill doesn't block.
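
For illustration, here is a minimal sketch of those two halves. It assumes a Unix platform where multiprocessing can use fork(), and that SIGTERM still has its default disposition when the demo starts. The names `run_demo` and `child`, and the pipe-based readiness check, are hypothetical and not part of this PR.

```python
import multiprocessing
import os
import signal
import time


def run_demo(num_children=2):
    """Fork children, send ourselves SIGTERM, and return the children's exit codes."""
    ctx = multiprocessing.get_context("fork")  # assumption: a Unix platform
    processes = []
    original_handlers = {}

    def handle_signal(signum, frame):
        # os.kill() only sends the signal; it does not wait for the child to react.
        for p in processes:
            os.kill(p.pid, signum)

    def child(ready_writer):
        # Undo the parent's forwarding handlers so that this process gets the
        # original behaviour back (by default, SIGTERM terminates the process).
        for sig, handler in original_handlers.items():
            if handler is not None:
                signal.signal(sig, handler)
        ready_writer.send("ready")
        time.sleep(30)  # stand-in for a worker's main loop

    for sig in (signal.SIGINT, signal.SIGTERM):
        original_handlers[sig] = signal.signal(sig, handle_signal)
    try:
        readers = []
        for _ in range(num_children):
            reader, writer = ctx.Pipe(duplex=False)
            p = ctx.Process(target=child, args=(writer,))
            p.start()
            processes.append(p)
            readers.append(reader)
        # Wait until every child has reset its handlers before signalling.
        for reader in readers:
            reader.poll(10)

        # Half 1: the custom handler replaces the default action, so this
        # signal does not terminate the parent.
        os.kill(os.getpid(), signal.SIGTERM)

        # Half 2: join() is what makes the parent wait for the children.
        for p in processes:
            p.join(timeout=10)
        return [p.exitcode for p in processes]
    finally:
        # Leave the calling process with the handlers it started with.
        for sig, handler in original_handlers.items():
            if handler is not None:
                signal.signal(sig, handler)
```

Calling `run_demo()` returns one exit code per child. multiprocessing reports a child killed by a signal as the negative signal number, so the parent surviving to return that list is the "inhibits our own exit" half, and the `join()` loop is the "wait until the children have exited" half.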

+    def handle_signal(signum: int, frame: Optional[FrameType]) -> None:
+        print(
+            "complement_fork_starter: Caught signal %i. Stopping children." % signum,
+            file=sys.stderr,
+        )
+        for p in processes:
+            os.kill(p.pid, signum)
+
+    for sig in (signal.SIGINT, signal.SIGTERM):
+        _original_signal_handlers[sig] = signal.signal(sig, handle_signal)

Contributor

Crucially, this works because:

The previous signal handler will be returned

(I wish it was called set_or_replace_signal and not just signal. I guess the name comes from signal(2)?)

Member Author

I guess the name comes from signal(2)?

... which has the same somewhat unintuitive semantics, yes. I guess it made sense 45 years ago.
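
A small sketch of the save-and-restore pattern this relies on, using only the standard `signal` module. The helper names `install_handler`, `restore_handlers` and `noop_handler` are hypothetical, chosen for illustration; the PR itself stores the return value directly in `_original_signal_handlers`.

```python
import signal


def install_handler(sig, handler, saved):
    """Install `handler` for `sig`, remembering whatever was there before."""
    # signal.signal() both sets the new handler and returns the previous one.
    saved[sig] = signal.signal(sig, handler)


def restore_handlers(saved):
    """Put back every handler recorded by install_handler()."""
    for sig, previous in saved.items():
        # None means the previous handler was not installed from Python,
        # so there is nothing we can hand back to signal.signal().
        if previous is not None:
            signal.signal(sig, previous)


def noop_handler(signum, frame):
    """Placeholder handler that ignores the signal."""


saved_handlers = {}
install_handler(signal.SIGTERM, noop_handler, saved_handlers)
restore_handlers(saved_handlers)
```

The previous handler may be `signal.SIG_DFL`, `signal.SIG_IGN`, a Python callable, or `None`, which is why the restore step skips `None` here. Whether that guard is needed in the fork starter depends on how the launcher process was started.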


     # At this point, we've imported all the main entrypoints for all the workers.
     # Now we basically just fork() out to create the workers we need.
     # Because we're using fork(), all the workers get a clone of this launcher's
     # memory space and don't need to repeat the work of loading the code!
     # Instead of using fork() directly, we use the multiprocessing library,
     # which uses fork() on Unix platforms.
-    processes = []
     for (func, worker_args) in zip(worker_functions, args_by_worker):
         process = multiprocessing.Process(
             target=_worker_entrypoint, args=(func, proxy_reactor, worker_args)