-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: stop subscription goroutines if node is stopped #404
Conversation
lnclient/lnd/lnd.go
Outdated
@@ -444,6 +447,12 @@ func NewLNDService(ctx context.Context, eventPublisher events.EventPublisher, ln | |||
default: | |||
payment, err := paymentStream.Recv() | |||
if err != nil { | |||
if grpcErr, ok := status.FromError(err); ok { | |||
if grpcErr.Code() == codes.Unknown { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
codes.Unknown
feels wrong, but we should otherwise use grpcErr.Message() == "routerrpc server shutting down"
which looks hacky, what should I do here?
@im-adithya @bumi what about we shut down the LNClient if we detect LND is not running, and also do not allow Alby Hub to start if LND is not running. Is this possible? The current change only stops the subscriptions, and leaves the LNClient in a weird state |
pubkey string | ||
cancel context.CancelFunc | ||
client *wrapper.LNDWrapper | ||
nodeInfo *lnclient.NodeInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lmk if this is good, then I will do this for other backends too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we should just do this on every backend. If we want to do that, I'd much rather store the network on LNClient startup and add a call to get the network to the LNClient interface
lnclient/lnd/lnd.go
Outdated
@@ -445,7 +458,7 @@ func NewLNDService(ctx context.Context, eventPublisher events.EventPublisher, ln | |||
payment, err := paymentStream.Recv() | |||
if err != nil { | |||
logger.Logger.WithError(err).Error("Failed to receive payment") | |||
continue | |||
break paymentsLoop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this double for loop is a bit scary and I don't think this lndCtx.Done()
is hit if LND is not running (because this for loop will probably never be entered?)
@im-adithya this is a case with complicated logic - can you please write down the tests you did to cover these cases? Is there a simpler way we can implement it? (I'm not sure there is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this lndCtx.Done() is hit if LND is not running
Yes, this won't be hit because lndClient.SubscribePayments
call itself would fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes, this should be better and will also end the goroutine in all cases so it should be safer too.
Tests done to check subscription logic:
- Normal run to see if subscription receives event and handles it properly
- Started Hub (with node running in Polar) and then killed context from Hub's go terminal -> Goes to the <-lndCtx.Done() case instead of breaking the paymentsLoop because ctx is ended, thereby ending the goroutine
- Started Hub (with node running in Polar) and stopped the node in Polar after the subscription is started -> Breaks the payment loop after waiting for 2s and tries to subscribe again. It fails now and continues after waiting for 10s (this keeps going on and on)
- After (3), I turned the node on again, and the next attempt after 10s worked and started the subscription again 🔁
- After (3), I cancelled the context while it was waiting for 10s and it ended the goroutine due to switch case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@im-adithya awesome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
Fixes #388
We should also add a screen to display when the node is offline, which can be used by backends other than LDK, because currently Alby Hub goes blank if the node is offline with no message. But we can open a separate issue for that(?)