Multi worker-node setup failing to process when one node is removed/crashes #982
Comments
Can you test your setup with the v1 branch, @magrossi?
I can certainly try. Is that branch suited for production use, should it pass this small test?
It is actually running in production on my side; however, I haven't wrapped it up into a release yet.
Hi @behrad, the v1 branch seems incompatible with master from the user API perspective. Without changing my client code, the new branch does not work; it fails to connect to Redis, I'm guessing. Is there any documentation, or are there samples, on how to start using this new branch? Thank you.
It's 90% compatible. Have you npm installed Kue's dependencies?
Yes, I did.
Aha... another thing is
Ah, that must be it! I'll have a try and let you know how it goes. Thank you.
@behrad I'm facing the same issue; I have upgraded to v1 and am testing. Just to note, your v1 branch does attach more than 11 listeners to a single event, which I guess is related to the number of jobs in a queue. Do we really need that many? It exceeds Node's default threshold.
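For illustration only, a hedged sketch of one possible workaround for the listener warning, assuming the queue object behaves as a standard Node EventEmitter; nothing in this thread confirms this is the recommended fix:

```js
var kue = require('kue');
var queue = kue.createQueue();

// Node's EventEmitter warns once more than 10 listeners are attached to a
// single event. Raising (or removing) the limit only silences the warning;
// it does not change Kue's behaviour. 0 means "no limit".
queue.setMaxListeners(0);
```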
Hi @behrad,
@magrossi What changes did you make to get it working? For me it throws an error almost immediately, and I do have ioredis installed.
@jamesjjk I had to change the options I pass to Kue. You now need to set "port" and "host" on the main options object, whereas previously they sat in a "redis" property (see the sketch below). Also, I had to remove any calls to "watchStuckJobs", as it is no longer in the API. If you change your package.json to point to the v1 branch and run npm install kue afterwards, it should be good to go. Regards.
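For illustration, a minimal sketch of the options change described above; the host and port values are placeholder assumptions, not copied from this thread:

```js
var kue = require('kue');

// Pre-v1 style: connection details nested under a "redis" property.
// var queue = kue.createQueue({ redis: { host: '127.0.0.1', port: 6379 } });

// v1 style, as described in the comment above: "host" and "port" sit on the
// main options object, and calls to queue.watchStuckJobs() are dropped since
// that method left the API.
var queue = kue.createQueue({
  host: '127.0.0.1',
  port: 6379
});
```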
@magrossi Thanks, that worked. I'm still facing some exceptions with v1 and am debugging.
Hi,
I have two host VMs running my Kue worker process. If all nodes are online, everything works fine, but during some tests, if I bring one host down (by killing/suspending the VM, i.e. an ungraceful worker shutdown), the system gets out of sync.
Similarly to issue #130, new jobs are not processed right away: the queue processes the first inactive job, leaving the newest ones inactive forever until new jobs are created, and so on, always leaving 2 inactive jobs unprocessed.
Due to the nature of my app, this completely breaks the user experience. We are calling the stuck job watcher every second [queue.watchStuckJobs(1000)], and even that does not fix the issue.
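For context, a rough sketch of the kind of worker setup described above; the 'email' job type and the handler body are illustrative assumptions, not taken from this thread:

```js
var kue = require('kue');
var queue = kue.createQueue();

// Check for stuck jobs every second, as described above
// (queue.watchStuckJobs is part of Kue's pre-v1 API).
queue.watchStuckJobs(1000);

// Hypothetical job type and handler, purely for illustration.
queue.process('email', function (job, done) {
  console.log('processing job', job.id);
  done();
});
```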
Is this by design? If one worker crashes unexpectedly, does the whole system become unreliable? Is there anything I can do to remedy this?
Thank you!
Kind regards