How to handle connection loss? #18

joypeterson · 2016-03-15T21:21:29Z

What is the best way to recover from a lost connection to the RabbitMQ server, from either a client or a listener? Currently, if I lose the connection, Seneca calls root.die, which terminates my application. I need to be able to try to re-establish the connection instead of exiting the application.

Thanks

nfantone · 2016-03-16T00:49:46Z

This is a good question. Actually, Seneca behaves like that by design. If your microservice breaks in some way, it shouldn't be kept alive. Instead, requests to that service should be redirected to, or answered by, other instances while you revive the dead application (something like pm2 or forever would do that for you). That's one way to be tolerant of failure.

If you lose connectivity to the broker, your service becomes virtually useless - if it doesn't, I would argue that the application shouldn't be considered a microservice to begin with. Why would you want to have it around, trying to re-connect, wasting resources?

Nevertheless, having a reconnection strategy and being able to recover could be useful in some scenarios, where your application does, in fact, perform other tasks (like answering HTTP requests) and you don't want to let those clients down.

joypeterson · 2016-03-16T16:12:32Z

Thank you for the feedback. The service I am using this for does answer HTTP requests in addition to using seneca-amqp-transport to coordinate with other services, so it would be very useful to keep that HTTP endpoint available. It sounds like I might have to implement retry logic in all the services that call that HTTP endpoint, unless you have some ideas on how to intercept the connection error from amqplib and re-establish the connection without Seneca shutting down.

Thanks.
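A minimal sketch of the kind of interception asked about here, assuming the connection object emits 'error' and 'close' events the way amqplib connections do. The names `keepConnected`, `connect`, and `onReady` are hypothetical and are not part of Seneca or seneca-amqp-transport. The sketch only shows a reconnect loop at the AMQP level; it does not by itself stop Seneca from calling root.die.

```javascript
'use strict';

// Hypothetical helper: opens a connection through the injected `connect`
// function and opens a fresh one whenever the current connection closes.
// With amqplib, `connect` could be `() => require('amqplib').connect(url)`.
function keepConnected(connect, onReady, retryMs = 1000) {
  let stopped = false;

  async function open() {
    if (stopped) return;
    try {
      const conn = await connect();
      // An EventEmitter with no 'error' listener throws, which would crash the process.
      conn.on('error', (err) => console.error('connection error:', err.message));
      // 'close' is the signal to schedule a new connection attempt.
      conn.once('close', () => setTimeout(open, retryMs));
      onReady(conn);
    } catch (err) {
      console.error('connect failed:', err.message);
      setTimeout(open, retryMs);
    }
  }

  open();
  // Calling the returned function prevents any further reconnect attempts.
  return () => { stopped = true; };
}
```

Channels, consumers, and any Seneca listeners would need to be set up again inside `onReady` for each new connection, which is the reconciliation cost discussed later in this thread.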

nfantone · 2016-03-16T17:07:00Z

I'm adding the request label and I'll review it. It may be worthwhile implementing this. In the meantime, I'd strongly suggest that you:

- Do not rely on a broken microservice. Just let it die and restart it.
- Consider splitting your HTTP endpoints into a separate, independent service/application.
- Have redundant microservices in a virtual cluster (again, pm2 or forever can help here) or a distributed one, with some kind of load balancing.

deedubs · 2016-03-24T21:05:34Z

I'd like to vote in favour of the current functionality. IMHO it's always safer to fail fast and recover from a fresh instance than to reconcile once you've re-established connectivity.

nfantone · 2016-03-28T15:19:18Z

Agree with @deedubs.

Still, I'll see what I can do about this.

mpseidel · 2017-01-13T11:13:43Z

@nfantone any updates on this? I understand the design decision generally but still struggle to decide if a basic recovery from an AMQP connection glitch would be better in our scenario.

We have an API gateway service that accepts HTTP calls and routes messages to our RabbitMQ cluster. If a connection to one rabbit instance goes bad we might be able to quickly fail over to another instance without letting all current requests to the gateway container die.

Any suggestions?
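The client-side failover described here can be sketched as trying a list of broker URLs in order. This is an illustration only: `connectToFirstAvailable` is a hypothetical helper, the URLs in the usage note are placeholders, and nothing here reflects how seneca-amqp-transport currently behaves.

```javascript
'use strict';

// Hypothetical helper: tries each broker URL in order and resolves with the
// first connection that succeeds. `connect` is injected; with amqplib it
// could be `(url) => require('amqplib').connect(url)`.
async function connectToFirstAvailable(urls, connect) {
  const errors = [];
  for (const url of urls) {
    try {
      const connection = await connect(url);
      return { url, connection };
    } catch (err) {
      // Remember why this node failed, then move on to the next one.
      errors.push(`${url}: ${err.message}`);
    }
  }
  throw new Error(`no broker reachable (${errors.join('; ')})`);
}
```

A call might look like `connectToFirstAvailable(['amqp://rabbit-1', 'amqp://rabbit-2'], connect)`, with the host names standing in for the real cluster nodes.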

nfantone · 2017-01-13T16:54:18Z

@mpseidel I've been very busy, mostly rewriting the entire plugin. Work on #73 is now on develop and soon to be released. It should now be easier to add new stuff. So, I'd like to tackle this and other long-pending feature requests next. I apologize for the delay. And, of course, new PRs are always welcome.

In the meantime, you should be load balancing requests to your RabbitMQ nodes. This is one of the advantages of putting a cluster together. But, from your words, you are already using one.

> If a connection to one rabbit instance goes bad we might be able to quickly fail over to another instance without letting all current requests to the gateway container die.

☝️ That is exactly what HA on RabbitMQ clusters is all about.

mpseidel · 2017-01-13T17:03:32Z

Thanks for the update! Yes, we are already failing over, but only after Seneca crashes the Node process, which may cause multiple in-flight HTTP requests to die as well.

brad-decker · 2017-06-22T23:25:31Z

Maybe unrelated, maybe not. We are working with Docker, and because depends_on doesn't wait for the container to be running and available, we have to rely on 'sleep' to make microservices wait to start until RabbitMQ is theoretically available. Is there a way to gracefully retry the connection for a period of time before killing the service?
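One way to replace the fixed 'sleep' is to retry the initial connection until a deadline and only then give up. The sketch below assumes the connection attempt is wrapped in an injected `connect` function; `waitForBroker`, `timeoutMs`, and `intervalMs` are hypothetical names, not options of seneca-amqp-transport.

```javascript
'use strict';

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retries `connect` every `intervalMs` until it succeeds
// or until another wait would run past `timeoutMs`, then rejects.
async function waitForBroker(connect, { timeoutMs = 60000, intervalMs = 2000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      return await connect();
    } catch (err) {
      // Give up if waiting one more interval would exceed the deadline.
      if (Date.now() + intervalMs > deadline) {
        throw new Error(`broker not reachable within ${timeoutMs} ms: ${err.message}`);
      }
      await sleep(intervalMs);
    }
  }
}
```

The service would start Seneca only after `waitForBroker` resolves, and exit with a non-zero code if it rejects so the container orchestrator can restart it.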

nfantone added the question label Mar 16, 2016

nfantone added the request label Mar 16, 2016

nfantone self-assigned this Mar 28, 2016
