Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle no run when rescuing errors in TaskJob #375

Merged
merged 2 commits into from
Mar 18, 2021

Conversation

adrianna-chang-shopify
Copy link
Contributor

@adrianna-chang-shopify adrianna-chang-shopify commented Mar 15, 2021

Closes: #374

Errors in a Task can occur before the before_perform callback runs (ie. before @run) is set. The main scenario to be concerned about here is if deserializing the run fails (in the Partners app, this happened because the connection to the MySQL server was lost when attempting to deserialize). Another case to consider is that users might have a custom Job with a callback that raises for some reason, prior to our TaskJob#before_perform callback happening.

Right now, on_error blows up when @run is not defined. This leads to unintended consequences - namely, job retries happening that we are not in control of. The upside is that it makes the framework somewhat "resilient" (ie. the job might retry and succeed, leading the Task to continue running successfully), but this actually ends up being problematic because we can't determine exactly what will happen, and it might lead to the run getting into a state that messes up the whole system.

We should solve this by only calling methods on @run if it's defined. Otherwise, we can still call the error handler, but do so with an empty task_context hash.

I thought about checking if defined?(@run) && @run, but I'm pretty sure @run has to be present if it's defined since we control enqueuing TaskJob in the Runner. (So either we fail to deserialize properly and that blows up, or we set @run to a valid object. Since we're the only ones who enqueue this job via Runner#enqueue, I don't think this could ever get set to nil, it can only be undefined)

@adrianna-chang-shopify adrianna-chang-shopify requested review from etiennebarrie and removed request for etiennebarrie March 15, 2021 18:24
Copy link
Member

@etiennebarrie etiennebarrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a word to the README about the fact that the task_context may now be empty. I don't think it warrants a major bump but technically it is backward incompatible.
BTW we should fix

  • error: The object containing the exception that was raised.

I wonder whether we should rescue everything in on_error to handle similar issues. For example the `error_handler may raise an exception, it will bubble up to the queue adapter, which in the case of Sidekiq will retry. It's definitely not for this PR, I'm not sure it's a good idea because it may end up hiding problems, just thinking out loud.

@adrianna-chang-shopify
Copy link
Contributor Author

Yeah, good point about the error handler @etiennebarrie . Sidekiq rescues exceptions thrown by any error handlers, so might make sense on our end. If there's a chance of getting into a weird state with multiple workers due to a retry it's probably worth fixing. I'll add a card to our project board.

@adrianna-chang-shopify adrianna-chang-shopify merged commit 9dfd0fc into main Mar 18, 2021
@adrianna-chang-shopify adrianna-chang-shopify deleted the handle-no-run-in-on-error branch March 18, 2021 19:31
lawrencewong pushed a commit to lawrencewong/maintenance_tasks that referenced this pull request Apr 29, 2023
* Handle no run when rescuing errors in TaskJob
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't blow up in on_error if the run wasn't deserialized properly
2 participants