Cron scheduler and multiple processes #1128

olivier-thatch · 2023-10-30T22:50:46Z

Hello,

We're currently evaluating switching from Sidekiq to GoodJob, and I'd like to get some clarity on GoodJob's exact behavior when using the cron scheduler and multiple processes.

Right now we're using sidekiq-cron for scheduled jobs. sidekiq-cron has a built-in mechanism to ensure scheduled jobs only get enqueued once even when running multiple processes, but it is reliant on a Redis feature:

Sidekiq-Cron is safe to use with multiple Sidekiq processes or nodes. It uses a Redis sorted set to determine that only the first process who asks can enqueue scheduled jobs into the queue.

GoodJob's README seems to imply that it should be safe to have multiple processes running with the scheduler enabled:

GoodJob's cron uses unique indexes to ensure that only a single job is enqueued at the given time interval.

but then the configuration example below specifically enables scheduling on a single process:

# Enable cron in this process, e.g., only run on the first Heroku worker process
config.good_job.enable_cron = ENV['DYNO'] == 'worker.1' # or `true` or via $GOOD_JOB_ENABLE_CRON

Unfortunately our hosting provider doesn't have an equivalent to Heroku's $DYNO. Their recommendation was to decouple scheduling from the background worker service, i.e. have the worker service (Sidekiq/GoodJob) run without scheduling, and a separate service that is only responsible for enqueuing scheduled jobs. This way the worker service can be scaled as needed and the scheduled job service can run on a single instance. This would be a bit annoying to implement and kind of a waste, since the scheduled job service will do nothing 99% of the time.

Am I just worrying over nothing? Can we simply scale GoodJob over multiple instances without having to worry about scheduled jobs being enqueued multiple times and running concurrently?


bensheldon · 2023-10-30T23:40:39Z

You don't need to worry, and I should rewrite that documentation to be a little clearer. Something like:

GoodJob's cron is safe to use with multiple processes or containers. GoodJob ensures only a single job is enqueued by placing a unique compound index on the jobs table that prevents the insertion of duplicate job records with the same cron configuration key and scheduled time ([cron_key, cron_at]).

The additional stuff is more like:

While entirely optional, if you want to keep duplicate key collisions from cluttering your logs, you could...

...though I should probably just remove that entirely.
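
To make the unique-index idea concrete, here is a minimal sketch in plain Ruby. It is not GoodJob's code: `FakeJobsTable`, `DuplicateCronEntry`, and `simulate_cron_tick` are hypothetical names, and a `Set` guarded by a `Mutex` stands in for the database's unique compound index on [cron_key, cron_at]. Threads play the role of separate processes that all try to enqueue the same scheduled run.

```ruby
require "set"

# Hypothetical error standing in for the unique-index violation a database
# would raise on a duplicate [cron_key, cron_at] insert.
class DuplicateCronEntry < StandardError; end

# Hypothetical in-memory stand-in for a jobs table with a unique compound
# index on [cron_key, cron_at]. Not part of GoodJob.
class FakeJobsTable
  def initialize
    @rows = Set.new
    @lock = Mutex.new
  end

  # Inserts the pair, or raises if the same pair was already inserted.
  def insert!(cron_key, cron_at)
    @lock.synchronize do
      # Set#add? returns nil when the element is already present.
      raise DuplicateCronEntry, "#{cron_key} @ #{cron_at}" unless @rows.add?([cron_key, cron_at])
    end
  end

  def count
    @lock.synchronize { @rows.size }
  end
end

# Simulates `process_count` schedulers firing for the same cron entry at the
# same scheduled time. Returns one symbol per simulated process.
def simulate_cron_tick(table, cron_key, cron_at, process_count)
  threads = Array.new(process_count) do
    Thread.new do
      table.insert!(cron_key, cron_at)
      :enqueued
    rescue DuplicateCronEntry
      # Losing the race is expected and harmless: another process already
      # enqueued this run.
      :skipped
    end
  end
  threads.map(&:value)
end
```

Calling `simulate_cron_tick(FakeJobsTable.new, "nightly_report", Time.utc(2023, 10, 30, 23), 5)` returns exactly one `:enqueued` and four `:skipped`, in no particular order. The skipped attempts correspond to the duplicate key collisions mentioned above, which is why they can show up in logs without indicating a problem.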

olivier-thatch · 2023-10-31T00:59:17Z

Perfect, thanks for the quick reply! I'm looking forward to switching to GoodJob 🥳

olivier-thatch closed this as completed Oct 31, 2023
