
Use unique index on [cron_key, cron_at] columns to prevent duplicate cron jobs from being enqueued #423

Merged: 1 commit merged into main from unique_cron_columns on Oct 25, 2021

Conversation

@bensheldon (Owner) commented Oct 11, 2021

Connects to #392.

  • Adds a new timestamp column cron_at that stores the time for which the cronned job has been enqueued
  • Adds a unique index on [cron_key, cron_at] to ensure that only one job is enqueued for the given key and time
  • Handles the expected ActiveRecord::RecordNotUnique when multiple cron processes try to enqueue the job

I'm honestly not sure whether this is a huge improvement to Cron, because it sidesteps some of the tight race conditions of GoodJob::ActiveJobExtensions::Concurrency... or a terrible idea, because of some as-yet-unidentified database performance pressure this will generate.

One bit of complexity here is that only one good_jobs record (the initial GoodJob::Execution) will have the cron_at value. So if the job is retried, the subsequent Executions in that retry chain will not have the cron_at value. Operationally it won't affect anything, but it is an inconsistency in job state across executions that disappoints me.

Note: I believe this should be a backwards-compatible/safe change; the unique index can be created on existing tables because Postgres never considers NULL values equal to each other, and the code will not try to fill in the column values unless both the columns and indexes exist.
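
For reference, here is a minimal sketch of the schema change described above. It is not the exact migration GoodJob ships; the column and index names follow the log output in the next comment, and the migration class name is illustrative.

```ruby
class AddCronAtToGoodJobs < ActiveRecord::Migration[5.2]
  disable_ddl_transaction! # required for algorithm: :concurrently

  def change
    # The cron time this job was enqueued for; NULL for non-cron jobs and retries.
    add_column :good_jobs, :cron_at, :timestamp

    # At most one row per (cron_key, cron_at). Rows where either column is NULL
    # never conflict, so the index is safe to add alongside existing rows.
    add_index :good_jobs, [:cron_key, :cron_at],
              unique: true,
              algorithm: :concurrently,
              name: :index_good_jobs_on_cron_key_and_cron_at
  end
end
```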

@bensheldon (Owner, Author) commented Oct 12, 2021

This is what shows up in the logs from the expected failed inserts:

[ActiveJob]   TRANSACTION (8.1ms)  COMMIT
[ActiveJob]   GoodJob::Execution Create (13.7ms)  INSERT INTO "good_jobs" ("queue_name", "priority", "serialized_params", "scheduled_at", "created_at", "updated_at", "active_job_id", "cron_key", "cron_at") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9) RETURNING "id"  [["queue_name", "default"], ["priority", -10], ["serialized_params", "{\"job_class\":\"ExampleJob\",\"job_id\":\"b7cafca8-cee8-48fd-8bfc-543cbe173c7a\",\"provider_job_id\":null,\"queue_name\":\"default\",\"priority\":-10,\"arguments\":[\"slow\"],\"executions\":0,\"exception_executions\":{},\"locale\":\"en\",\"timezone\":\"UTC\",\"enqueued_at\":\"2021-10-12T00:36:45Z\"}"], ["scheduled_at", "2021-10-12 00:36:45.015142"], ["created_at", "2021-10-12 00:36:45.035467"], ["updated_at", "2021-10-12 00:36:45.035467"], ["active_job_id", "b7cafca8-cee8-48fd-8bfc-543cbe173c7a"], ["cron_key", "frequent_example"], ["cron_at", "2021-10-12 00:36:45"]]
[ActiveJob] Enqueued ExampleJob (Job ID: 3e75ae4f-94ff-46c4-abef-d5e864071580) to GoodJob(elephants) at 2021-10-12 00:36:45 UTC with arguments: "slow"
[ActiveJob]   TRANSACTION (0.2ms)  ROLLBACK
[ActiveJob] Failed enqueuing ExampleJob to GoodJob(default): ActiveRecord::RecordNotUnique (PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint "index_good_jobs_on_cron_key_and_cron_at"
DETAIL:  Key (cron_key, cron_at)=(frequent_example, 2021-10-12 00:36:45) already exists.
)

Unfortunately it's not until Rails 7 that there will be a way for the Adapter to cleanly reject an enqueue (rails/rails#41191), which is why the ActiveRecord::RecordNotUnique is caught higher in the call chain than I would like.
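
To illustrate where that rescue ends up living, here is a rough sketch of the cron scheduler's enqueue path. The method and argument names (enqueue_cron_entry, cron_entry) are illustrative, not GoodJob's actual internals.

```ruby
def enqueue_cron_entry(cron_entry, cron_at)
  # perform_later ultimately runs the INSERT shown in the log above; if another
  # process already inserted a row for this (cron_key, cron_at), Postgres raises
  # a unique violation that surfaces as ActiveRecord::RecordNotUnique.
  cron_entry.job_class.constantize
            .set(wait_until: cron_at)
            .perform_later(*cron_entry.args)
rescue ActiveRecord::RecordNotUnique
  # Another cron process won the race; the job is already enqueued, so do nothing.
  nil
end
```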

@aried3r (Contributor) commented Oct 12, 2021

Should the migrations all use [5.2] as their version, or should we perhaps use the Rails version a user is currently on? Migration versions were introduced in rails/rails#21538, and some gems, like paper_trail, make use of the current version. See here and here (although ActiveRecord::VERSION::STRING.to_f would also work).
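
To illustrate the two options being discussed (a sketch only; the class names are illustrative, not GoodJob's actual generated migration):

```ruby
# Option 1: hardcode the oldest supported version in the generated migration.
class CreateGoodJobs < ActiveRecord::Migration[5.2]
  def change
    # ...
  end
end

# Option 2: target the Rails version the application is actually running,
# e.g. by having the generator emit:
#   class CreateGoodJobs < ActiveRecord::Migration[<%= ActiveRecord::VERSION::STRING.to_f %>]
```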

@bensheldon (Owner, Author)

> Should the migrations all use [5.2] as their version, or should we perhaps use the Rails version a user is currently on?

Good catch, but I don't think it matters. GoodJob's migrations should be compatible with the oldest supported Rails version. I think I was originally comforted by ActiveStorage, which hardcodes versions in its migrations: https://github.com/rails/rails/blob/826f947ce7078a66e93276c8102dd235bb629911/activestorage/db/migrate/20170806125915_create_active_storage_tables.rb#L1

@aried3r (Contributor) commented Oct 12, 2021

I have to admit I don't know exactly how migration compatibility works in Rails, but it seems that migrations might produce different results depending on the version; see this comment and their tests.

My assumption is that people would rather have migrations work the way they do on their current Rails version?

> ActiveStorage, which hardcodes versions in its migrations

Interesting, I wonder why. I went back to the very first commit, and it seems it has simply been that way from the start. Then again, the maintainers probably know exactly why they did it.

@bensheldon (Owner, Author)

I've updated the migrations to include the <%= migration_version %> helper introduced in #426.
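
For context, that helper is typically defined on the generator and interpolated into the ERB migration templates. A minimal sketch follows; the generator class name and the exact body introduced in #426 may differ.

```ruby
module GoodJob
  class InstallGenerator < Rails::Generators::Base
    private

    # Emits e.g. "[6.1]" so generated migrations target the host app's Rails version.
    def migration_version
      "[#{ActiveRecord::VERSION::STRING.to_f}]"
    end
  end
end

# In the migration template (e.g. create_good_jobs.rb.erb):
#   class CreateGoodJobs < ActiveRecord::Migration<%= migration_version %>
```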

bensheldon merged commit ec73ea3 into main on Oct 25, 2021
bensheldon deleted the unique_cron_columns branch on October 25, 2021 14:19
@zeevy (Contributor) commented Jan 27, 2022

> Unfortunately it's not until Rails 7 that there will be a way for the Adapter to cleanly reject an enqueue (rails/rails#41191), which is why the ActiveRecord::RecordNotUnique is caught higher in the call chain than I would like.

It seems Rails 7 still has this problem:

Failed enqueuing Job to GoodJob(default): ActiveRecord::RecordNotUnique (PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint "index_good_jobs_on_cron_key_and_cron_at"
DETAIL:  Key (cron_key, cron_at)=(fake_data_reset, 2022-01-24 23:00:00) already exists.
)

I have two processes running in Docker Swarm, with Rails 7.0.1.

@bensheldon (Owner, Author)

@zeevy thanks for the reminder! Rails now has the capability, but GoodJob hasn't been updated to take advantage of it. I'll write up an Issue.
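
For anyone following along: the Rails 7 mechanism from rails/rails#41191 lets a queue adapter raise ActiveJob::EnqueueError, which ActiveJob records on the job and turns into a false return from perform_later instead of a raised exception. A sketch of what adopting it could look like (not what GoodJob does at this point; the internal GoodJob::Execution.enqueue call is shown only for illustration):

```ruby
module GoodJob
  class Adapter
    def enqueue(active_job)
      # The real adapter does more here (scheduling, advisory locks, etc.).
      GoodJob::Execution.enqueue(active_job)
    rescue ActiveRecord::RecordNotUnique => e
      # ActiveJob catches EnqueueError, stores it on the job, and makes
      # perform_later return false rather than letting the error escape.
      raise ActiveJob::EnqueueError, e.message
    end
  end
end
```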
