Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically send status notifications to systemd #1029

Conversation

Mr0grog
Copy link
Contributor

@Mr0grog Mr0grog commented Aug 5, 2023

This adds built-in support for sending service status information to systemd as well as systemd’s watchdog (where it expects you to ping it every N seconds to let it know your process isn’t stuck and in need of a restart). It also adds an example systemd configuration file in examples/systemd/goodjob.service. Fixes #1027.

This works, but I’m not sure whether it’s totally ready. Some questions I was left with:

  • In Add systemd/sd_notify support to CLI #1027, you asked to vendor the sd_notify gem, which I did. It looks like this is the first vendored dependency, so I looked around and saw that Rubocop was already configured to expect a top-level vendor directory, so I put it there. However, that meant I had to add that directory to the .gemspec file, and also add a require_relative call in a lib file, which seems non-ideal. Should this go somewhere else?

    • Nope, this is good
    • In lib/good_job/vendor/<filename>
    • In lib/good_job/<filename>, undifferentiated from other library files
    • Somewhere else?
  • sorbet/tapioca/require.rb is missing some things that bin/tapioca require wants to add. I added the minimum needed to make Sorbet happy, but not sure it should be further updated to reflect what the command-line tools would otherwise add to it.

  • Logging: is using ActiveSupport notifications and adding methods to GoodJob::LogSubscriber for logging correct? I originally tried to use GoodJob.logger (based on configuration.rb and probe_server.rb) but realized that neither outputs to STDOUT nor tags the log lines. I feel a little awkward about stuff that should be very narrowly scoped to the CLI executable leaking out into a place for handling more generic events across the library.

  • The way watchdog support works is a little naive — it just starts a thread that sends notifications on a timer without checking anything. I think this is ok for a start (it’s the same as Sidekiq and Puma), but ideally, it would either:

    1. Check the status of various threads (I see the Probe Server kind of does this, but I wasn’t sure which things we should expect to always be good from a systemd perspective) or
    2. Should have other parts of the system proactively check in with it (to detect deadlocked threads or something).

    But really if you’re already confident that the GoodJob CLI crashes or gracefully exits anytime you’d need systemd to restart it, then the current behavior is fine.

  • Relatedly, I kind of wondered if the CLI should use ActiveSupport notifications to signal status things, and both this and the Probe Server should subscribe to those instead of being managed directly in GoodJob::CLI#start. But that seemed like growing the scope of this change unnecessarily.

    (One thing this immediately would get is notification about the scheduler_restart_pools event. Systemd has a RELOADING notification GoodJob would ideally send it when this happens. But GoodJob also already write a log about that, so it’s probably not a big deal.)

  • I cribbed a lot from Sidekiq and from @Roko131’s example in Add systemd example at documentation #988; I don’t know if I should credit those (particularly Sidekiq, as a separate project) somehow. I don’t think that’s super important for an example config file like this, but figure it’s worth calling out.

    (Also, some stuff in both of those seemed out of date and I cleaned it up — on Ubuntu 22, systemd screamed at me about using syslog, which is apparently no longer supported. You’re supposed to use journal now.)

  • I named the example service goodjob to match the example rc.d configuration. But I kind of intuitively though maybe it should be good_job? Not really a big deal.

Mr0grog added 9 commits August 4, 2023 14:13
This adds a somewhat naive `SystemdService` class that will send appropriate notifications to systemd or do nothing if it doesn't detect that the process is being run by systemd. It pretty much just notifies that the process is ready, is stopping, and actively notifies the watchdog while it's running. It doesn't reach into the actual job system to check that things are OK, or hook into ActiveSupport notifications to tell systemd about other events (like restarting/reloading).

Fixes bensheldon#1027.
Stubbed constants are very not cool as far as Sorbet is concerned, so the way I wrote these tests before broke the linter. OTOH, trying to figure out how to make this acceptable did get some slightly nicer test setup.
examples/systemd/goodjob.service Show resolved Hide resolved
examples/systemd/goodjob.service Show resolved Hide resolved
lib/good_job/systemd_service.rb Show resolved Hide resolved
@bensheldon
Copy link
Owner

To your other questions:

  • lib/good_job/<filename> is good. The file has enough boilerplate license at the top of the file that I think it's obvious that it's special.
  • Sorbet change looks good 👍🏻
  • Instrumentation + LogSubscriber methods is perfect. I agree that it isn't tightly scoped, but it's nice having a central place to format log messages.
  • I think directly adding the behavior to the CLI is good for now. It would be nice to have hookable events/callbacks for the behavior... but we can consider it YAGNI until there are maybe a few more hooks.
  • For RESTARTING that's good to note, but not necessary right now as there isn't a way to trigger a GoodJob restart externally (there's no HUP signal at the moment).
  • If you wanted to rename both the service names to be good_job, I wouldn't object; it's also fine to leave them as is.

@Mr0grog
Copy link
Contributor Author

Mr0grog commented Aug 5, 2023

OK, I moved the sd_notify.rb module into lib, so assuming this passes tests, it should be good to go.

It's good to know why these are skipped so we can stop skipping them in the future if the JRuby issue is fixed (or someone figures out a workaround).
@bensheldon bensheldon added the enhancement New feature or request label Aug 6, 2023
@bensheldon bensheldon merged commit f879537 into bensheldon:main Aug 6, 2023
anthonyroussel added a commit to lafourmiliere-benevolat/capistrano-good_job that referenced this pull request Jan 14, 2024
anthonyroussel added a commit to lafourmiliere-benevolat/capistrano-good_job that referenced this pull request Jan 14, 2024
anthonyroussel added a commit to lafourmiliere-benevolat/capistrano-good_job that referenced this pull request Jan 15, 2024
anthonyroussel added a commit to lafourmiliere-benevolat/capistrano-good_job that referenced this pull request Jan 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

Add systemd/sd_notify support to CLI
2 participants