Due to failed reconnection, paying an invoice incurs unnecessary fees. #1003
Hi, how about investigating why the node was seen as offline by your eclair desktop? Eclair will try to reconnect to a known peer, and if it can't, it's worth investigating why. Perhaps the MemeracingLN node changed its IP address? Note that if the receiving node is really OFFLINE you won't be able to pay anyway.
TL;DR: The connection was closed at 01:14:18,669, re-established at 08:50:53,868 and then the peer closed it at 08:50:53,991, and then "got new transport while already connected, switching to new transport" at 08:50:53,995, but ultimately lost the connection at 08:50:53,996. After that last loss there are NO attempts to re-establish the connection. Here are the log entries containing the node ID of the receiver before the payment was attempted:
Several repeats of the attempt and failure entries for seven hours or so, and then success:
My connection is kind of spotty (as you see automatic connection failed for seven hours). There are no other log entries with that node ID until several (I assume) normal log entries (regarding sending through the channel we have open) and then this:
...and then, 14 minutes later, because I manually asked to open a simple connection:
So, YES, there is a good reason that the receiver was offline (the sender's connection to it was spotty), but after the connection loss at 2019-05-13 08:50:53,996, there is no log entry with the receiving node's ID (which I interpret to mean no attempt to re-establish a connection). From this I interpret one bug:
... and one possible enhancement to handle the situation where:
Eclair tries to reconnect only to peers whose address it knows, and only if there are non-closed channels with them. The reconnection attempts happen at increasingly large delays (max delay is 60s). Unfortunately, when the connection is initiated by the remote peer, eclair doesn't store the IP address. Looking at your logs, I see eclair trying to connect to remote until
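To illustrate what "increasingly large delays (max delay is 60s)" could look like, here is a minimal sketch of a capped backoff schedule. It is not eclair's code: the function name `reconnect_delays`, the 1 s starting delay, and the doubling factor are assumptions made for the example; only the 60 s cap comes from the comment above.

```python
from itertools import islice
from typing import Iterator


def reconnect_delays(initial: float = 1.0,
                     cap: float = 60.0,
                     factor: float = 2.0) -> Iterator[float]:
    """Yield an endless sequence of retry delays, in seconds.

    Each delay is `factor` times the previous one, but never above `cap`.
    `initial` and `factor` are assumed values; the 60 s cap is the only
    number taken from the discussion.
    """
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First eight delays of the assumed schedule.
first_eight = list(islice(reconnect_delays(), 8))
print(first_eight)
```

With a schedule like this the retries never stop; they only level off at the cap. That is the behaviour one would expect for a peer whose address is known.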
I don't know how long you'd have to wait to get a

eclair-core/src/main/scala/fr/acinq/eclair/db/sqlite/SqliteChannelsDb.scala:
eclair-core/src/test/scala/fr/acinq/eclair/db/SqliteChannelsDbSpec.scala
eclair-core/src/main/scala/fr/acinq/eclair/db/sqlite/SqlitePendingRelayDb.scala

I searched for addOrUpdateChannel to identify callers that would have to pass in the connection info:

eclair-core/src/test/scala/fr/acinq/eclair/db/BackupHandlerSpec.scala
eclair-core/src/main/scala/fr/acinq/eclair/channel/Channel.scala

I hunted around a bit more to see about adding (if it's not already there) the connection info to a channel. I found that the thing passed around (for example to addOrUpdateChannel) is a "hasCommitments" which has a nodeID and Commitments. Well, Commitments has a RemoteParams property which is a subclass of NodeParams which has a list of publicAddresses. The code can get the connection info from the hasCommitments and store it in the local_channels table.

When, as in my case, an existing "OFFLINE" channel is best for a payment and hasn't been re-established because the remote node is the one that last initiated the connection, the connection info can be read from the database and the payment made directly.
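The idea in the comment above (remember a peer's address when it is known, read it back when the channel is offline) can be sketched with a small SQLite table. This is only an illustration and not eclair's schema: the table `peer_addresses`, the helpers `open_db`, `remember_address` and `lookup_address`, and the sample node ID, host and port are all hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical address table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE peer_addresses ("
        "node_id TEXT PRIMARY KEY, host TEXT NOT NULL, port INTEGER NOT NULL)"
    )
    return db


def remember_address(db: sqlite3.Connection, node_id: str,
                     host: str, port: int) -> None:
    """Store or overwrite the last known address of a peer."""
    db.execute(
        "INSERT OR REPLACE INTO peer_addresses (node_id, host, port) "
        "VALUES (?, ?, ?)",
        (node_id, host, port),
    )
    db.commit()


def lookup_address(db: sqlite3.Connection,
                   node_id: str) -> Optional[Tuple[str, int]]:
    """Return (host, port) for a peer, or None if nothing was stored."""
    row = db.execute(
        "SELECT host, port FROM peer_addresses WHERE node_id = ?",
        (node_id,),
    ).fetchone()
    return (row[0], row[1]) if row else None


db = open_db()
# Placeholder values: "02abc" is not a real node ID, 203.0.113.5 is a
# documentation-only IP address.
remember_address(db, "02abc", "203.0.113.5", 9735)
print(lookup_address(db, "02abc"))
print(lookup_address(db, "unknown-node"))
```

Keying the table by node ID rather than by channel means that several channels to the same peer share one address record, and a newer address simply replaces the older one.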
I have the same issue of offline channels even though they are actually online and my node should have all the necessary information to reconnect. My channel with Zigzag.io https://yalls.org/network/570892x1080x0 has this issue a lot, even though it's a highly active channel that routes a lot of payments as long as it is online. Unfortunately I have to manually reconnect every other day. The channel is outgoing, and according to 1ml they never had an IP address change, so my node should be able to automatically reconnect any time. @pm47 mentioned in the chat that @araspitzu is already working on a solution, so I just want to provide my logs in case they are of any help. This is when the connection was lost yesterday:
This morning upon checking my node I saw the channel being offline and manually reconnected. There was no attempt to reconnect or any other communication whatsoever in between.
According to yalls.org, it was the Zigzag.io node that terminated the connection. No idea why. My node's connection is VERY stable (cable) and usually only has very short outages every couple of months. In fact, it was relaying a couple of payments pretty much every hour via other channels in that 10h time period when the Zigzag.io channel was offline. So it doesn't seem like my node had any connection issues. I understand that when a peer terminates the connection it is up to the peer to reconnect right now? No idea why Zigzag.io doesn't try to reconnect, but I see no reason why my node can't give it a try itself.
Is there an explanation for my reconnection issue to the Zigzag.io node mentioned above? If I read the description of your pull request right, reconnecting outgoing channels to nodes that didn't have an IP address change shouldn't be an issue even without this new fix?!
I have two nodes. I created an invoice on one and paid it from the other. I paid it from Eclair Desktop. There is a channel (575638x611x1) from my Eclair node to my other node (MemeracingLN). Eclair says MemeracingLN is offline. However, MemeracingLN was the target of the payment and the invoice has been settled on the MemeracingLN node.
Eclair (Desktop, v 0.3) paid an invoice to a node without using its direct channel to that node because the node is recorded in Eclair as offline.
If we are paying a node to which we have a channel then use the channel, and if the channel is offline, try a simple connection to it, and if that doesn't work, ask the user.
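The behavior described above can be sketched as a small decision function. This is a sketch of the proposal, not of how eclair works: `Channel`, `Action`, `choose_action` and the `try_connect` callback are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Action(Enum):
    PAY_DIRECT = "pay over the direct channel"
    ASK_USER = "ask the user how to proceed"
    FIND_ROUTE = "search for a multi-hop route"


@dataclass
class Channel:
    peer_id: str
    online: bool


def choose_action(target_id: str,
                  channels: List[Channel],
                  try_connect: Callable[[str], bool]) -> Action:
    """Decide how to pay `target_id` under the proposed rules.

    `try_connect` stands in for a plain connection attempt and returns
    True when the peer could be reached.
    """
    direct = next((c for c in channels if c.peer_id == target_id), None)
    if direct is None:
        # No direct channel: normal route finding applies.
        return Action.FIND_ROUTE
    if not direct.online:
        # The OFFLINE flag may be stale, so try to connect before giving up.
        direct.online = try_connect(target_id)
    return Action.PAY_DIRECT if direct.online else Action.ASK_USER


# The receiver is marked offline but is actually reachable.
channels = [Channel(peer_id="MemeracingLN", online=False)]
print(choose_action("MemeracingLN", channels, lambda _peer: True))
```

The point of the sketch is the ordering: the connection attempt happens before any routing, so a stale OFFLINE flag alone cannot push the payment onto a longer route that costs fees.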
I have not attempted to reproduce. I would expect the code is written to have the behavior I got, simply because the reliability of the OFFLINE indication was assumed to be high. It isn't, so I propose the following solution:
Before finding a route to the receiving node, check if we have a channel to it and if so, do not route. Instead:
I am on Windows 10 using JDK 11.