Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay considering a channel closed when seeing an on-chain spend #2936

Merged
merged 13 commits into from
Dec 13, 2024

Conversation

remyers
Copy link
Contributor

@remyers remyers commented Nov 4, 2024

See issue #2437

When an external channel is spent, the router will add it to a new spentChannels map instead of immediately removing it from the graph.

If after 12 blocks an entry in the spentChannels map is not part of a splice, it will be removed as before.

When a newly added channel is validated, if it spends the shared output of a recently spent channel then it is a splice.

A splice will update the shortChannelId and capacity of graph edges of the parent channel instead of removing the parent's edges and adding edges for a new channel. A splice will also reuse and adjust the parent edge's low/high balance bounds information based on the splice.

@remyers
Copy link
Contributor Author

remyers commented Nov 4, 2024

@thomash-acinq how do you think a splice should change the low/high bounds estimate for a graph edge? This PR currenty increases the 'high' for the edge by the amount a spliced channel's capacity increases and decrease the 'low' for the edge by the amount it's capacity decreases. Is that correct? I tried to copy the logic used currently when an channel is added or removed, but only for the change in capacity instead of the entire channel's capacity.

@thomash-acinq
Copy link
Member

@thomash-acinq how do you think a splice should change the low/high bounds estimate for a graph edge? This PR currenty increases the 'high' for the edge by the amount a spliced channel's capacity increases and decrease the 'low' for the edge by the amount it's capacity decreases. Is that correct? I tried to copy the logic used currently when an channel is added or removed, but only for the change in capacity instead of the entire channel's capacity.

That looks good.

@remyers remyers marked this pull request as ready for review November 6, 2024 09:33
@remyers
Copy link
Contributor Author

remyers commented Nov 28, 2024

I refactored spentChannels in d7b9bea.

Now spentChannels will map from the txid of the tx that spends a funding tx, to the scid of the funding tx. If there are RBF attempts, then there will be multiple entries of (spending tx -> scid) .

Each spending tx will be watched until it has 12 confirmations (2 x the depth of a validated node announcement). I do not think we need a new config parameter for this. If we do add a new parameter, then it should never be less than 12 blocks to ensure that any channel announcement for a splice has time to confirm.

If a spending tx confirms after 12 blocks without a channel announcement being validated for the spending tx, then the normal flow for removing a channel occurs.

If a channel announcement is validated before the spending tx has 12 confirmations, then a new flow for updating the graph for a splice occurs.

When a spending tx confirms to remove or splice a channel, then all spentChannels entries that map to the scid of the original funding tx will be removed.

Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't fully reviewed the tests yet, but the implementation looks good to me, my comments are mostly nit! Can you rebase after addressing my comments, and then we'll work on finalizing the tests and merging this?

Issue ACINQ#2437

When an external channel is spent, add it to the `spentChannels` list instead of immediately removing it from the graph.

RBF attempts can produce multiple spending txs in the mempool for the same channel.

The `spendChannels` list maps the txid of the spending tx to the scid of the spent channel.

When a channel announcement is validated with a funding tx on the `spentChannels` list, consider the new channel a splice of the corresponding spent channel.

A splice updates the graph edges to preserve balance estimate information in the graph.

If a spending tx from the `spentChannels` list is deeply buried before appearing in a valid channel announcement, remove the corresponding spent channel edge from the graph. Also remove any corresponding spending tx entries on the `spentChannels` list.
Also fixed two other issues:
 - add the splice channel to the routing graph as a new channel if the parent channel does not exist (could be pruned)
 - when a splice confirms, also send ChannelLost to remove the parent scid from the FrontRouter
Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking mostly good, I need to spend a bit more time on tests. I've added a commit here with fixes for my comments: e85512b

I haven't fully fixed the tests though to expect the new commands sent to the watcher, I'll let you fix that!

t-bast and others added 2 commits December 10, 2024 20:08
We unwatch:

- channels that have been spliced or closed
- RBF candidates of a splice or closing transaction

We revert the modification to channel updates to avoid invalidating
the signature.
- Fixup so Router does not send `UnwatchTxConfirmed` only for the spending txs that will never confirm. Channel also has a `WatchTxConfirmed` event that may trigger later and should not be removed.
- Fixup channel integration tests to generate enough blocks to deeply confirm a channel has closed once before waiting for the channel to close.
- Fixup router tests to check that funding tx spend is unwatched after it's spending tx confirms
 - fix the three places I missed for generating blocks to create deeply confirmed closing txs
 - improved how the router looks for UnwatchedExternalSpent message to lookup the txid from the scid
 - changed the comment in Validation be more clear
Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's see if the e2e integration tests find any issues, but otherwise this should be ready to merge.

 - I am not sure yet why the Router does not update the scid/capacity after the splice.
@remyers
Copy link
Contributor Author

remyers commented Dec 11, 2024

I pushed a preliminary integration test in 87c4346

The scids do not update in the channels (we probably need PR #2941), and the Router does not seem to be getting the balance capacity update. Any tips?

- when the Router receives a AvailableBalanceChanged after a splice it now updates the capacity of the channel to the current capacity of the latest commitment.
- This test demonstrates that local channels update their capacity, but we can not test the remote node (carol) because the ChannelAnnouncement is ignored because it has a duplicate scid. After PR ACINQ#2941 we can fix this test.
 - the spent output should match the whole Outpoint of the funding tx
Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's go 🚀

@harding
Copy link

harding commented Jan 16, 2025

Out of curiosity, what's the motivation for tracking channels across splicing (rather than treating them as new channels)? The two reasons I can think of are:

  1. Faster acceptance of the updated channel information. E.g., we might not accept or relay channel announcements for new channels less than 6 blocks old (because sane nodes won't forward HTLCs away from the funder until reorg risk is minimal; however, splices are protected against reorg risk (up to the amount in the pre-splice channel), so bidirectional forwarding can continue).

  2. Channel age is being used (in part) to select routes, e.g. older channels are preferred over newer channels when all other things are equal. Splice detection allows imbuing the updated channel with the previous channel's age.

I'm also a little confused about the update to the low/high balance information. Based on a skim of the code, it looks to me like balance information for both splice-ins and splice-outs is being updated as soon as a splice is detected, but that's not what I expect. When we detect a splice-out (a decrease in capacity), that should be immediately reflected. When we detect a splice-in (an increase in capacity), that shouldn't be reflected until confirmed to a reasonable depth (since sane nodes won't allow forwarding HTLCs with that additional capacity until then).

@t-bast
Copy link
Member

t-bast commented Jan 16, 2025

The two reasons you're mentioning are exactly the reasons why we're doing this:

  • we want to keep the channel in our graph while the splice is confirming to keep using it when sending outgoing payments, because they can most likely be relayed (unless a large splice-out severely decreased the channel capacity, but it's worth a try)
    • if we didn't do that, we would remove the channel from our graph as soon as it's spent and would only re-add it a few blocks later when we receive the splice announcement, which means a ~6 blocks unnecessary "downtime" for that channel
  • channel age matters in path-finding, we've found that this is a very good indication of a healthy channel that is likely to relay payments, so it's worth preserving!
  • since gossip can be unreliable, being able to use the previous channel_updates immediately ensures that we don't have any unnecessary "downtime" trying to use that channel (and if relay fees changed after the splice, we will receive the updated channel_update with the payment failure which will let us retry immediately with the right parameters)

I'm also a little confused about the update to the low/high balance information. Based on a skim of the code, it looks to me like balance information for both splice-ins and splice-outs is being updated as soon as a splice is detected, but that's not what I expect. When we detect a splice-out (a decrease in capacity), that should be immediately reflected. When we detect a splice-in (an increase in capacity), that shouldn't be reflected until confirmed to a reasonable depth (since sane nodes won't allow forwarding HTLCs with that additional capacity until then).

You are correct, this is the expected behavior. This is actually what happens here, because we use availableBalanceForSend / availableBalanceForReceive on the Commitments object, which contains all pending splices and will use the most restrictive balance of all commitments. But if we've incorrectly implemented this, can you point me to the code that doesn't seem to do this?

@harding
Copy link

harding commented Jan 17, 2025

This is actually what happens here, because we use availableBalanceForSend / availableBalanceForReceive on the Commitments object, which contains all pending splices and will use the most restrictive balance of all commitments. But if we've incorrectly implemented this, can you point me to the code that doesn't seem to do this?

Sounds like you got it. Sorry for sounding concerned!

Thanks for the great reply, and to you all for adding this nice feature!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants