Is there an advised way to handle jobs that get stuck in an active state? #391
Comments
solutions at the worker level code:
solutions at the kue built-in (not implemented yet):
|
Is there any plan for the built-in mechanism? |
No milestones for those two yet.
|
…failed. An uncaught exception makes Kue stop and leaves the job stuck in an unknown state. Related to Automattic#391
How do you suggest pushing these broken jobs back on to the queue safely? |
|
Excellent, thank you! |
Yep that's the idea, I wasn't aware of some silent exceptions not bubbling up in my promises until today. Luckily it was only my dev env. Thanks for the hard work and advice, I will check it out!! |
Hi, how can I query active jobs programmatically? I can't find it in the Kue docs. Thanks |
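One way to approach this, as a minimal sketch: the Kue README shows `queue.active(cb)` returning the ids of active jobs and `kue.Job.get(id, cb)` loading a single job. The helper name `listActiveJobs` is hypothetical, and `queue` and `Job` are passed in as parameters (assumed to be the result of `kue.createQueue()` and `kue.Job`) so the logic can be tried without a Redis connection.

```javascript
// Sketch: collect all jobs currently in the "active" state.
// Assumes `queue.active(cb)` yields job ids and `Job.get(id, cb)` loads a job,
// as shown in the Kue README. `listActiveJobs` itself is a hypothetical helper.
function listActiveJobs(queue, Job, callback) {
  queue.active(function (err, ids) {
    if (err) return callback(err);
    if (ids.length === 0) return callback(null, []);

    const jobs = [];
    let pending = ids.length;
    let finished = false;

    ids.forEach(function (id) {
      Job.get(id, function (getErr, job) {
        if (finished) return;
        if (getErr) {
          // Report the first load error once and ignore later results.
          finished = true;
          return callback(getErr);
        }
        jobs.push(job);
        pending -= 1;
        if (pending === 0) {
          finished = true;
          callback(null, jobs);
        }
      });
    });
  });
}
```

With a real queue this would presumably be called as `listActiveJobs(kue.createQueue(), kue.Job, cb)`. Jobs arrive in whatever order the lookups complete, so sort them if order matters.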
I'm having this problem right now too. If an error happens you can catch it with domains/uncaughtException etc, but what if you just forgot to call done()? All other message queues I know of (RabbitMQ, Iron, AWS SQS) have the notion of timeouts where a job is retried if it runs past the TTL. Really hoping Kue gets this! |
Then you shouldn't blame Kue for your stuck active jobs ;) And as I said above, a TTL would be just a workaround: you lose queue concurrency bandwidth until the TTL expires for the stuck jobs.
I'm eager to implement TTL in Kue, however I think default action should be marking job as failed. It will then be retried if it has remaining attempts. And bear in mind that Kue is a job queue, not a message queue :) |
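Until something is built in, a TTL can be approximated at the worker level. The following is only a sketch: `withTtl` and its `timers` parameter are hypothetical names, not part of Kue. It wraps a handler with the usual `(job, done)` signature so that `done` is called with an error if the handler has not finished within `ttlMs`, which should mark the attempt as failed in the way described above.

```javascript
// Sketch: guarantee that done() is called at most once, either by the handler
// or by a timer after ttlMs. `withTtl` and `timers` are hypothetical; `timers`
// exists only so the timeout can be driven manually in tests.
function withTtl(handler, ttlMs, timers) {
  const t = timers || {
    setTimeout: function (fn, ms) { return setTimeout(fn, ms); },
    clearTimeout: function (handle) { clearTimeout(handle); },
  };

  return function (job, done) {
    let settled = false;
    let timer = null;

    // Whoever gets here first wins; later calls are ignored.
    function finish(err, result) {
      if (settled) return;
      settled = true;
      if (timer !== null) t.clearTimeout(timer);
      done(err, result);
    }

    timer = t.setTimeout(function () {
      finish(new Error('job exceeded TTL of ' + ttlMs + ' ms'));
    }, ttlMs);

    try {
      handler(job, finish);
    } catch (err) {
      // A synchronous throw is reported as a failure instead of
      // leaving the job in the active state.
      finish(err);
    }
  };
}
```

A handler would then be registered as something like `queue.process('email', withTtl(sendEmail, 30000))`, where `'email'` and `sendEmail` are placeholders. The timer only releases the job; it does not stop whatever the hung handler is still doing, and it cannot help if the whole process dies.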
You say "however I think default action should be marking job as failed"-- how else do you do TTLs? I think that is what's being proposed. Forgetting to call done isn't always a programming flaw; functions can hang for lots of reasons unanticipated by the primary programmer, such as library flaws. Given the amount of functionality Kue does have, it just seems strange that it doesn't provide TTL functionality. Most of the popular alternatives I've used (SQS, IronMQ, RabbitMQ) seem to have it. Also, from my research, job queues are a subset of message queues, so all job queues are message queues, which means Kue is a message queue. What do you mean it's not a message queue? What specific aspect of message queues doesn't apply to Kue? |
There's no argument that Kue should and will have a TTL implementation.
Job queues are more granular, focused abstractions, usually built on top of message queues, and are more concerned with batch processing, task distribution, workload management, ...
You can't say Resque, Celery, Kue, ... are Redis, RabbitMQ, ActiveMQ, ...! Can you? Message queues sit under the hood of job queues and offer a wider set of MOM and middleware features such as durability, reliability, types of queues, routing, pub/sub, selectable producer/consumer patterns, ...
Kue has no routing, wildcard consumers, message level configurability, ... |
Really helpful explanation, thanks! |
Document on using domains #403 and why Kue has not used them builtin !? |
I have a situation where a worker picks a job up, starts processing it, and then in the middle of processing, I get a fatal process error (due to bad OpenCV errors I can't seem to catch). PM2 restarts the app, and continues processing.
However, that job has been marked as active, and is now stuck.
Is there a way to clear out active jobs, pushing them back to failed if they are in an active state too long?
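A rough sketch of such a cleanup pass, under some assumptions: `failStaleActiveJobs` is a hypothetical helper, `queue.active(cb)` and `Job.get(id, cb)` are used as in the Kue README, and each loaded job is assumed to expose an `updated_at` timestamp in milliseconds and a `failed()` method that moves it to the failed state. `queue` and `Job` are passed in so the logic can be checked without Redis.

```javascript
// Sketch: move jobs that have been active longer than maxActiveMs to "failed".
// Assumes job.updated_at (ms since epoch, possibly a string) and job.failed()
// exist on Kue jobs; `failStaleActiveJobs` is a hypothetical helper name.
function failStaleActiveJobs(queue, Job, maxActiveMs, now, callback) {
  queue.active(function (err, ids) {
    if (err) return callback(err);

    const failedIds = [];
    let pending = ids.length;
    if (pending === 0) return callback(null, failedIds);

    ids.forEach(function (id) {
      Job.get(id, function (getErr, job) {
        // Jobs that cannot be loaded are skipped in this sketch.
        if (!getErr && job) {
          const lastUpdate = Number(job.updated_at);
          if (Number.isFinite(lastUpdate) && now - lastUpdate > maxActiveMs) {
            job.failed(); // fire-and-forget; the state change is not awaited here
            failedIds.push(job.id);
          }
        }
        pending -= 1;
        if (pending === 0) callback(null, failedIds);
      });
    });
  });
}
```

This could run once at startup after PM2 restarts the app, or on an interval, for example `failStaleActiveJobs(queue, kue.Job, 10 * 60 * 1000, Date.now(), cb)`. The threshold needs to be comfortably longer than the slowest legitimate job, since a job that is merely slow on another worker looks the same as a stuck one.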