Allow job management (retry, destroy) through the Web UI #256

Closed
bensheldon opened this issue May 15, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@bensheldon
Owner

bensheldon commented May 15, 2021

Follow-up to #50, with features reconsidered in light of #255. Please share examples from other adapters.

@bensheldon bensheldon added the enhancement New feature or request label May 17, 2021
morgoth added a commit to tiramizoo/good_job that referenced this issue May 28, 2021
morgoth added a commit to tiramizoo/good_job that referenced this issue May 28, 2021
bensheldon pushed a commit that referenced this issue Jun 4, 2021
* Add deleting jobs from UI.

refs #256

* Improve deleting jobs

* Move deleting jobs to own controller
@bensheldon
Owner Author

Thank you @morgoth for jumping into #265 🙏

I wanted to share some further thoughts/learnings from reviewing that:

  • Conceptually, I think we need to be clearer through the UI that the GoodJob::Job object is an "ActiveJob Job Run/Execution". Making ActiveJob "Jobs" (AJob) and GoodJob "Jobs" (GJob) have a conceptual collision is something I regret.
  • If you also implement "retries", I think that needs to act on the ActiveJob Job to ensure that only the "head/latest" GJob is retried, to avoid having multiple executing GJobs for the same AJob. The retry operation will also need to be advisory locked, for the same reason. The latest PR, #265 ("Add deleting jobs from UI."), does risk deleting a GJob that is currently executing, which would result in an error; I think the risk is low, but it is something to clear up.
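
To make the "head/latest" idea concrete, here is a minimal sketch in plain Ruby. It assumes each GJob row carries the id of the AJob it belongs to plus a creation timestamp; the `Execution` struct and `head_executions` helper are hypothetical names for illustration, not GoodJob's actual API.

```ruby
# Hypothetical in-memory stand-in for a GJob row (one execution of an AJob).
Execution = Struct.new(:id, :active_job_id, :created_at, keyword_init: true)

# Group executions by their ActiveJob id and keep only the newest one per job.
# Only these "head" executions would be eligible for a retry.
def head_executions(executions)
  executions
    .group_by(&:active_job_id)
    .transform_values { |runs| runs.max_by(&:created_at) }
end

executions = [
  Execution.new(id: 1, active_job_id: "a", created_at: Time.at(100)),
  Execution.new(id: 2, active_job_id: "a", created_at: Time.at(200)),
  Execution.new(id: 3, active_job_id: "b", created_at: Time.at(150))
]

heads = head_executions(executions)
puts heads["a"].id
puts heads["b"].id
```

Retrying only the head execution means an older GJob for the same AJob can never be re-enqueued alongside a newer one.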

@jrochkind
Contributor

I'm imagining the use case for "retry through the Web UI" being for failed jobs. By the time it gets to good_job as a failure, ActiveJob has necessarily given up on it, yes? So if "retry through the Web UI" is limited to failed jobs, is there less concern about a collision like that?

I am interested in this feature! As far as "Please share examples from other adapters", here is what resque looks like.

[Screenshot: resque Web UI, failed jobs tab]

There is a tab listing just "failed jobs". There is a button to retry all jobs in the list, or they can be retried individually. It somehow does keep track of whether a given job was already "retried" or not, which seems potentially tricky to do in good_job's architecture? The UI does not keep track of whether the retry succeeded or not (that would be nice). If the retry failed, it would show up a second time in the list, as a separate retried job.

I don't actually love the UI; it's just here as an example. I do love the ability to manually retry jobs that are not otherwise going to be retried because ActiveJob/resque have given up on them.

@bensheldon
Owner Author

Thanks @jrochkind for sharing that example and your preferences. I'm in strong agreement.

I think this feature is ripe for implementation right now. Especially Retry:

  • It should operate on the (not-well-named) ActiveJobJob that lives in the Engine (and could arguably be moved into the core lib)
  • it should wrap the operation in an advisory lock on the job
  • it should validate that the job is in a retryable state (e.g. it is "dead" or "failed", naming things is hard)
  • it should use as much of the ActiveJob retry logic as possible (e.g. this is an ActiveJob retry, not merely a re-enqueue of the GoodJob object). This will mean incrementing the serialized exception_executions args, etc.
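
The four points above could fit together roughly as follows. This is only a sketch under stated assumptions: `JobRecord`, `retry_job`, and both error classes are hypothetical, a `Mutex` stands in for a database advisory lock, "retryable" is taken to mean "finished with an error recorded", and the key used inside `exception_executions` is a simplification rather than ActiveJob's real format.

```ruby
# Hypothetical error classes for this sketch; not part of GoodJob or ActiveJob.
class NotRetryableError < StandardError; end
class LockUnavailableError < StandardError; end

# Hypothetical in-memory stand-in for the record that represents an ActiveJob job.
class JobRecord
  attr_reader :serialized_params
  attr_accessor :finished_at, :error_class, :scheduled_at

  def initialize(serialized_params:, finished_at: nil, error_class: nil)
    @serialized_params = serialized_params
    @finished_at = finished_at
    @error_class = error_class
    @scheduled_at = nil
    @lock = Mutex.new
  end

  # A Mutex stands in for a database advisory lock keyed on the job.
  def with_lock
    raise LockUnavailableError, "job is locked (possibly executing)" unless @lock.try_lock
    begin
      yield
    ensure
      @lock.unlock
    end
  end

  # Assumption: "dead"/"failed" means the job finished and recorded an error.
  def retryable?
    !finished_at.nil? && !error_class.nil?
  end
end

def retry_job(job, now: Time.now)
  job.with_lock do
    raise NotRetryableError, "job is not in a retryable state" unless job.retryable?

    # Update the retry bookkeeping instead of blindly re-enqueueing.
    # The key format for exception_executions is an assumption here.
    counts = (job.serialized_params["exception_executions"] ||= {})
    counts[job.error_class] = counts.fetch(job.error_class, 0) + 1

    # Reset the failure state and make the job eligible to run immediately.
    job.error_class = nil
    job.finished_at = nil
    job.scheduled_at = now
    job
  end
end

job = JobRecord.new(
  serialized_params: { "job_class" => "ExampleJob" },
  finished_at: Time.now,
  error_class: "RuntimeError"
)
retry_job(job)
```

Because the state check happens inside the lock, a second retry of the same job raises `NotRetryableError` instead of enqueueing a duplicate.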

I think a similar strategy should be taken for Delete.

Some other lifecycle ideas:

  • Cancel? Thinking of Scheduled Jobs, I think it might be helpful to be able to prevent them from ever executing. I also don't know if there would be a need to somehow terminate a long-running job, or to expose a cancellation so that a job that's iterating could check and bail.
  • Schedule Now. Thinking again of Scheduled Jobs, if you don't want to wait for the scheduled time. Probably useful if jobs have errored with incremental backoff, the problem has been fixed, and they should run immediately.
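
As a rough illustration of these two lifecycle ideas, the sketch below models a scheduled job in plain Ruby. The `ScheduledJob` struct, its `cancelled` flag, and the `schedule_now`, `cancel`, and `runnable?` helpers are all hypothetical; nothing here reflects how GoodJob stores or transitions jobs.

```ruby
# Hypothetical in-memory model of a scheduled job.
ScheduledJob = Struct.new(:id, :scheduled_at, :finished_at, :cancelled, keyword_init: true) do
  # A job is pending while it has neither finished nor been cancelled.
  def pending?
    finished_at.nil? && !cancelled
  end
end

# "Schedule Now": move the scheduled time up so the job is immediately eligible.
def schedule_now(job, now: Time.now)
  raise ArgumentError, "job is not pending" unless job.pending?
  job.scheduled_at = now
  job
end

# "Cancel": mark the job finished so that it is never picked up for execution.
def cancel(job, now: Time.now)
  raise ArgumentError, "job is not pending" unless job.pending?
  job.cancelled = true
  job.finished_at = now
  job
end

# A worker would only pick up pending jobs whose scheduled time has arrived.
def runnable?(job, now: Time.now)
  job.pending? && job.scheduled_at <= now
end

now = Time.at(1_000)
backoff_job = ScheduledJob.new(id: 1, scheduled_at: now + 3600, cancelled: false)
unwanted_job = ScheduledJob.new(id: 2, scheduled_at: now + 60, cancelled: false)

before = runnable?(backoff_job, now: now)
schedule_now(backoff_job, now: now)
after = runnable?(backoff_job, now: now)

cancel(unwanted_job, now: now)
```

Terminating a job that is already running is a separate problem and is not covered by this sketch.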

@bensheldon
Owner Author

Closing this because the core has been achieved. Please make additional requests/discussion in new Issues 👍
