htlcswitch: return inbound channel update #6967

Draft
wants to merge 9 commits into
base: master

joostjager
Contributor

@joostjager joostjager commented Oct 3, 2022

This PR shows the changes required to also return the inbound channel update in a FeeInsufficient failure, as an extension of #6703. The goal is to let a subsequent payment attempt succeed even when the sender did not have an up-to-date channel policy for the incoming channel and was using outdated inbound fees.

To remain backwards compatible, senders that are able to interpret long failure messages will signal this through onion tlv field 333. Its value is a single byte indicating the failure message version that the sender supports. To receive long failures, the sender should set the value to 1.
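For illustration, here is a minimal sketch of how such a record could be serialized. It does not use lnd's `tlv` package; `appendBigSize` and `encodeRecord` are hypothetical helpers written for this example. The result matches the encoded tlv quoted later in this thread (`0xfd014d0101`).

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// appendBigSize appends v using the BigSize encoding that TLV streams use
// for record types and lengths.
func appendBigSize(b []byte, v uint64) []byte {
	switch {
	case v < 0xfd:
		return append(b, byte(v))
	case v <= 0xffff:
		return append(b, 0xfd, byte(v>>8), byte(v))
	case v <= 0xffffffff:
		return append(b, 0xfe, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
	default:
		return append(b, 0xff,
			byte(v>>56), byte(v>>48), byte(v>>40), byte(v>>32),
			byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
	}
}

// encodeRecord serializes one TLV record as type, length, value.
func encodeRecord(typ uint64, value []byte) []byte {
	out := appendBigSize(nil, typ)
	out = appendBigSize(out, uint64(len(value)))
	return append(out, value...)
}

func main() {
	// Type 333 with a single-byte value of 1 (the supported failure
	// message version).
	rec := encodeRecord(333, []byte{1})
	fmt.Println(hex.EncodeToString(rec)) // prints fd014d0101
}
```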

Builds on #6703.

@saubyk saubyk added the inbound fee Changes related to inbound routing fee label Nov 9, 2022
@saubyk saubyk added this to the v0.17.0 milestone Nov 9, 2022
@joostjager joostjager force-pushed the inbound-fees-signal branch 3 times, most recently from 7774d4d to 2f3b228 Compare December 5, 2022 11:51
@joostjager
Contributor Author

Updated PR to not always return 1024 byte (padded) failure messages, because old senders would still choke on those. Currently it pads to 256 bytes. Longer messages aren't padded at all.
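The padding rule described here can be sketched as follows. This is an illustration only, not the code in this PR: `legacyFailureLen` and `padLen` are hypothetical names, and the sketch assumes that the 256 bytes cover the failure message plus its padding. That assumption is consistent with the encoded failure quoted later in this thread (failure length `009d` = 157, padding length `0063` = 99).

```go
package main

import "fmt"

// legacyFailureLen is the assumed combined size of failure message plus
// padding that senders without long-failure support can still handle.
const legacyFailureLen = 256

// padLen returns the number of padding bytes to append to a failure message
// of msgLen bytes. Messages at or above the legacy size get no padding.
func padLen(msgLen int) int {
	if msgLen >= legacyFailureLen {
		return 0
	}
	return legacyFailureLen - msgLen
}

func main() {
	fmt.Println(padLen(157)) // prints 99
	fmt.Println(padLen(300)) // prints 0
}
```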

@t-bast
Contributor

t-bast commented Dec 16, 2022

I'm trying to test this against ACINQ/eclair#2455 and I'm seeing issues.

First of all, I can't get lnd to send an error with a tlv stream: I do set tlv field 333 in the onion (the encoded tlv is: 0xfd014d0101), and send a payment Alice -> Bob -> Carol where the Bob -> Carol channel doesn't have enough liquidity, yet Bob doesn't add a tlv stream to the failure. Do I need to trigger a different, more specific error to get lnd to add tlvs?

Secondly, I tried it the other way around: eclair adds a tlv stream at the end of TemporaryChannelFailure. But this seems to break lnd:

2022-12-16 12:10:47.215 [ERR] CRTR: Unable to validate channel update: invalid signature for channel update (*lnwire.ChannelUpdate)(0xc00122fb80)({
 Signature: (lnwire.Sig) (len=64 cap=64) {
  00000000  c3 16 21 5b 71 35 90 bf  7e c2 54 a6 08 ea 6c 2c  |..![q5..~.T...l,|
  00000010  95 05 61 ed 1c 84 78 bf  ac 68 2d 03 2c cd 95 b5  |..a...x..h-.,...|
  00000020  41 fe aa 91 7d bb 25 e6  4d 24 bb 40 f0 be f9 38  |A...}.%.M$.@...8|
  00000030  f5 29 96 5c 71 8f 07 06  46 d1 ee 32 4f 21 1d d9  |.).\q...F..2O!..|
 },
 ChainHash: (chainhash.Hash) (len=32 cap=32) 0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206,
 ShortChannelID: (lnwire.ShortChannelID) 1746:2:1,
 Timestamp: (uint32) 1671188835,
 MessageFlags: (lnwire.ChanUpdateMsgFlags) 00000001,
 ChannelFlags: (lnwire.ChanUpdateChanFlags) 00000001,
 TimeLockDelta: (uint16) 48,
 HtlcMinimumMsat: (lnwire.MilliSatoshi) 1 mSAT,
 BaseFee: (uint32) 1000,
 FeeRate: (uint32) 200,
 HtlcMaximumMsat: (lnwire.MilliSatoshi) 67500000 mSAT,
 ExtraOpaqueData: (lnwire.ExtraOpaqueData) (len=15 cap=512) {
  00000000  fd 02 31 04 de ad be ef  fd 02 33 03 2a 2a 2a     |..1.......3.***|
 }
})

I believe lnd incorrectly assigns the tlv stream to the channel update, whereas the channel update is length-prefixed so it should be able to figure out that this tlv stream is outside of the channel update and belongs to the failure message.

@joostjager
Contributor Author

The encoded TLV looks good. But for inbound fees, an additional channel update is only returned in the case of a fee_insufficient failure. LND does not attach anything extra for temp_channel_failure. If you keep using the 333 tlv record and under-pay Bob's routing fee, you should get the expected failure message.
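As a rough sketch of when a forwarding node would consider the fee under-paid, the following uses the standard outbound fee formula (base fee plus proportional fee in parts per million). It deliberately ignores inbound fees, which this PR series adds on top, and `expectedFeeMsat` and `feeInsufficient` are hypothetical names rather than lnd functions. The policy values are the base fee of 1000 and fee rate of 200 that appear in the channel updates logged in this thread.

```go
package main

import "fmt"

// expectedFeeMsat computes the outbound routing fee for forwarding amtMsat:
// a fixed base fee plus a proportional part expressed in millionths.
func expectedFeeMsat(baseMsat, ratePPM, amtMsat uint64) uint64 {
	return baseMsat + amtMsat*ratePPM/1_000_000
}

// feeInsufficient reports whether the incoming HTLC amount fails to cover
// the outgoing amount plus the fee required by the policy.
func feeInsufficient(incomingMsat, outgoingMsat, baseMsat, ratePPM uint64) bool {
	required := outgoingMsat + expectedFeeMsat(baseMsat, ratePPM, outgoingMsat)
	return incomingMsat < required
}

func main() {
	// Forwarding 1,000,000 msat with base fee 1000 msat and 200 ppm.
	fmt.Println(expectedFeeMsat(1000, 200, 1_000_000)) // prints 1200

	// One msat short of the required fee: the node would fail the HTLC.
	fmt.Println(feeInsufficient(1_001_199, 1_000_000, 1000, 200)) // prints true
	fmt.Println(feeInsufficient(1_001_200, 1_000_000, 1000, 200)) // prints false
}
```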

The other issue I will take a look at, thanks for reporting!

@joostjager
Contributor Author

Fixed the invalid channel update issue, fix pushed.

@t-bast
Contributor

t-bast commented Dec 16, 2022

> If you keep using the 333 tlv record and under-pay Bob's routing fee, you should get the expected failure message.

Cool thanks for the details, I've done that and eclair was correctly able to deserialize the tlv stream sent by lnd 👍

> Fixed the invalid channel update issue, fix pushed.

It doesn't seem to be fixed as of 13a7e03. I'm still getting the following error from lnd when sending back a temporary_channel_failure containing a tlv stream:

2022-12-16 14:09:40.108 [WRN] CRTR: Attempt 1003 for payment fc5e53a0094807b1bc26e30a3baf5294b429e9cadacca04cd569a928fe8f906b failed: TemporaryChannelFailure(update=(*lnwire.ChannelUpdate)(0xc0007f40b0)({
 Signature: (lnwire.Sig) (len=64 cap=64) {
  00000000  60 5f 88 0f 54 45 b8 9c  38 6e 40 50 a6 8d 76 7d  |`_..TE..8n@P..v}|
  00000010  09 ac 5d 0c 5f c5 7b f7  b1 62 22 92 cd 25 a8 ef  |..]._.{..b"..%..|
  00000020  6e a4 bf d3 95 5b 3c a3  be 4a e4 4c 91 fe fe a8  |n....[<..J.L....|
  00000030  12 be 20 0d 0a a5 f8 03  2b e3 9a 7d 91 ef fe f7  |.. .....+..}....|
 },
 ChainHash: (chainhash.Hash) (len=32 cap=32) 0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206,
 ShortChannelID: (lnwire.ShortChannelID) 1786:2:1,
 Timestamp: (uint32) 1671195821,
 MessageFlags: (lnwire.ChanUpdateMsgFlags) 00000001,
 ChannelFlags: (lnwire.ChanUpdateChanFlags) 00000000,
 TimeLockDelta: (uint16) 48,
 HtlcMinimumMsat: (lnwire.MilliSatoshi) 1 mSAT,
 BaseFee: (uint32) 1000,
 FeeRate: (uint32) 200,
 HtlcMaximumMsat: (lnwire.MilliSatoshi) 67500000 mSAT,
 ExtraOpaqueData: (lnwire.ExtraOpaqueData) (len=15 cap=512) {
  00000000  fd 02 31 04 de ad be ef  fd 02 33 03 2a 2a 2a     |..1.......3.***|
 }
})
)@1
2022-12-16 14:09:40.121 [ERR] CRTR: Unable to validate channel update: invalid signature for channel update (*lnwire.ChannelUpdate)(0xc0007f40b0)({
 Signature: (lnwire.Sig) (len=64 cap=64) {
  00000000  60 5f 88 0f 54 45 b8 9c  38 6e 40 50 a6 8d 76 7d  |`_..TE..8n@P..v}|
  00000010  09 ac 5d 0c 5f c5 7b f7  b1 62 22 92 cd 25 a8 ef  |..]._.{..b"..%..|
  00000020  6e a4 bf d3 95 5b 3c a3  be 4a e4 4c 91 fe fe a8  |n....[<..J.L....|
  00000030  12 be 20 0d 0a a5 f8 03  2b e3 9a 7d 91 ef fe f7  |.. .....+..}....|
 },
 ChainHash: (chainhash.Hash) (len=32 cap=32) 0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206,
 ShortChannelID: (lnwire.ShortChannelID) 1786:2:1,
 Timestamp: (uint32) 1671195821,
 MessageFlags: (lnwire.ChanUpdateMsgFlags) 00000001,
 ChannelFlags: (lnwire.ChanUpdateChanFlags) 00000000,
 TimeLockDelta: (uint16) 48,
 HtlcMinimumMsat: (lnwire.MilliSatoshi) 1 mSAT,
 BaseFee: (uint32) 1000,
 FeeRate: (uint32) 200,
 HtlcMaximumMsat: (lnwire.MilliSatoshi) 67500000 mSAT,
 ExtraOpaqueData: (lnwire.ExtraOpaqueData) (len=15 cap=512) {
  00000000  fd 02 31 04 de ad be ef  fd 02 33 03 2a 2a 2a     |..1.......3.***|
 }
})

Here is the encoded failure message I'm sending:

6361222ca3f8091416c6bb7097bed80e5be5a88e5010eb461078d944eb978206 (mac)
009d (failure length)
1007008a0102605f880f5445b89c386e4050a68d767d09ac5d0c5fc57bf7b1622292cd25a8ef6ea4bfd3955b3ca3be4ae44c91fefea812be200d0aa5f8032be39a7d91effef706226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f0006fa0000020001639c6cad010000300000000000000001000003e8000000c8000000000405f7e0fd023104deadbeeffd0233032a2a2a (failure)
0063 (padding length)
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 (padding)

Here is the detailed breakdown of the encoded failure:

1007 (temporary_channel_failure)
008a (channel_update length)
0102 (channel_update message type)
605f880f5445b89c386e4050a68d767d09ac5d0c5fc57bf7b1622292cd25a8ef6ea4bfd3955b3ca3be4ae44c91fefea812be200d0aa5f8032be39a7d91effef706226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f0006fa0000020001639c6cad010000300000000000000001000003e8000000c8000000000405f7e0 (channel update)
fd023104deadbeef fd0233032a2a2a (two tlv fields in temporary_channel_failure's tlv stream)

I double-checked the lengths and they seem correct, so lnd should be able to separate the channel_update from the failure's tlv stream, but for some reason it doesn't seem to do it correctly.
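To make the expected split concrete, here is a small sketch that applies the length prefix to the failure bytes quoted above. It is not lnd's decoder; `splitFailure` and `mustHex` are hypothetical helpers, and the only assumption is the layout shown in the breakdown: a 2-byte failure code, a 2-byte channel_update length, the channel_update itself, then the trailing tlv stream.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// failureHex is the failure payload quoted in this thread, split into the
// same pieces as the breakdown: code, update length, update, tlv stream.
const failureHex = "1007" + // temporary_channel_failure
	"008a" + // channel_update length (138 bytes)
	"0102" + // channel_update message type
	"605f880f5445b89c386e4050a68d767d09ac5d0c5fc57bf7b1622292cd25a8ef" +
	"6ea4bfd3955b3ca3be4ae44c91fefea812be200d0aa5f8032be39a7d91effef7" +
	"06226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f" +
	"0006fa0000020001639c6cad010000300000000000000001000003e8000000c8" +
	"000000000405f7e0" +
	"fd023104deadbeef" + "fd0233032a2a2a" // failure-level tlv stream

// mustHex decodes a hex string and panics on malformed input.
func mustHex(s string) []byte {
	b, err := hex.DecodeString(s)
	if err != nil {
		panic(err)
	}
	return b
}

// splitFailure separates a failure payload into its failure code, the
// length-prefixed channel_update, and the bytes that trail the update.
func splitFailure(failure []byte) (code uint16, update, tlvStream []byte, err error) {
	if len(failure) < 4 {
		return 0, nil, nil, errors.New("failure too short")
	}
	code = binary.BigEndian.Uint16(failure[0:2])
	updateLen := int(binary.BigEndian.Uint16(failure[2:4]))
	rest := failure[4:]
	if updateLen > len(rest) {
		return 0, nil, nil, errors.New("channel_update length exceeds failure")
	}
	// Everything past updateLen belongs to the failure, not to the update.
	return code, rest[:updateLen], rest[updateLen:], nil
}

func main() {
	code, update, tlvStream, err := splitFailure(mustHex(failureHex))
	if err != nil {
		panic(err)
	}
	// prints code=0x1007 update=138 bytes tlv=fd023104deadbeeffd0233032a2a2a
	fmt.Printf("code=0x%04x update=%d bytes tlv=%x\n", code, len(update), tlvStream)
}
```

Bounding the channel_update reader to the 138 declared bytes leaves the 15 trailing bytes out of `ExtraOpaqueData`, which is where the logs above show them ending up.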

@joostjager
Contributor Author

Got it, thanks for all those intermediate results. Turns out this bug exists for other messages that contain a channel update as well. Will fix.

@joostjager
Contributor Author

Channel update reader fixed in #7262

@joostjager joostjager force-pushed the inbound-fees-signal branch from 13a7e03 to e9bbfd9 Compare January 3, 2023 09:53
@joostjager
Contributor Author

@t-bast rebased on top of #7262 to fix the failure tlv parsing bug, as discussed in yesterday's spec meeting.

@t-bast
Contributor

t-bast commented Jan 3, 2023

Thanks, I've re-run my cross-compat tests and everything looks good!

@joostjager joostjager force-pushed the inbound-fees-signal branch from e9bbfd9 to d03cef8 Compare January 5, 2023 12:47
@saubyk saubyk added the P1 MUST be fixed or reviewed label Aug 8, 2023
@saubyk saubyk removed this from the Low Priority milestone Aug 8, 2023
@saubyk saubyk added this to the v0.18.0 milestone Aug 10, 2023
return err
}

types, err := tlvStream.DecodeWithParsedTypes(r)
Collaborator

This should be using DecodeWithParsedTypesP2P to avoid a fake record with a maliciously long length
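The concern can be illustrated with a self-contained sketch of a length check applied before allocating a record's value. This does not reproduce lnd's `tlv` package: `readValue`, `maxRecordSize`, and `errRecordTooLarge` are hypothetical names, and the 65535-byte cap is an assumption chosen for the example.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// maxRecordSize is an assumed upper bound for a record received from an
// untrusted peer.
const maxRecordSize = 65535

var errRecordTooLarge = errors.New("tlv record exceeds size limit")

// readValue reads a record value of the declared length. It rejects an
// oversized declared length before allocating, so a fake record cannot
// force a huge allocation.
func readValue(r io.Reader, declaredLen uint64) ([]byte, error) {
	if declaredLen > maxRecordSize {
		return nil, errRecordTooLarge
	}
	buf := make([]byte, declaredLen)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	r := bytes.NewReader([]byte{0xde, 0xad, 0xbe, 0xef})

	v, err := readValue(r, 4)
	fmt.Printf("%x %v\n", v, err) // prints deadbeef <nil>

	// A record claiming a terabyte-sized value is rejected up front.
	_, err = readValue(r, 1<<40)
	fmt.Println(err) // prints tlv record exceeds size limit
}
```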

}

var b bytes.Buffer
if err := f.IncomingUpdate.Encode(&b, pver); err != nil {
Collaborator

b isn't used again, so can remove these 4 lines?

case *lnwire.FailIncorrectCltvExpiry:
update = &onionErr.Update
case *lnwire.FailChannelDisabled:
update = &onionErr.Update
case *lnwire.FailTemporaryChannelFailure:
if onionErr.Update == nil {
Collaborator

Curious, why was this added?

Contributor Author

IIRC, without this a nil update would be appended to the new updates list.
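A minimal sketch of the kind of guard being described, assuming the update is carried as an optional pointer. `channelUpdate` is a hypothetical stand-in for `lnwire.ChannelUpdate` and `collectUpdates` is not a function in this PR.

```go
package main

import "fmt"

// channelUpdate is a hypothetical stand-in for lnwire.ChannelUpdate.
type channelUpdate struct {
	ShortChannelID uint64
}

// collectUpdates gathers the updates carried by a failure, skipping any
// that are absent so that no nil entry ends up in the resulting list.
func collectUpdates(candidates ...*channelUpdate) []*channelUpdate {
	var updates []*channelUpdate
	for _, u := range candidates {
		if u == nil {
			continue
		}
		updates = append(updates, u)
	}
	return updates
}

func main() {
	outgoing := &channelUpdate{ShortChannelID: 1}

	// The failure in this example carries no second update.
	updates := collectUpdates(outgoing, nil)
	fmt.Println(len(updates)) // prints 1
}
```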

@joostjager
Contributor Author

The main open question for this PR is whether we still need a signal from the sender that they support long failure messages. Maybe in the meantime lnd 0.16 (which includes #6913) has spread widely enough that we can simply assume that support?

@joostjager
Contributor Author

To get a better idea of whether the signaling is needed, the advertised feature bits of public nodes could be useful. Long failure message incompatibility is purely an lnd issue; other implementations have supported it from the start, afaik.

Feature 31 amp is lnd-specific and has been in lnd for a long time already. In the public graph, there are 7036 nodes that have feature 31. Probably very close to the total number of public lnd nodes.

Feature 27 shutdown-any-segwit was added in lnd 0.15.1. The number of public nodes that advertise both 31 and 27 is 7003.

So it seems that nearly every lnd node on the network is 0.15.1 or above. Unfortunately no new feature bits have been added after that, so it is unclear what the proportion of nodes running 0.16+ is.
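The counting behind these numbers can be sketched as below. The graph in the example is made up purely to show the method; only the two totals (7036 and 7003) come from this comment, and `countFeatures` and `percent` are hypothetical helpers.

```go
package main

import "fmt"

// countFeatures returns how many nodes advertise feature bit a, and how many
// of those also advertise bit b. Each node is a set of advertised bits.
func countFeatures(nodes []map[int]bool, a, b int) (withA, withBoth int) {
	for _, features := range nodes {
		if !features[a] {
			continue
		}
		withA++
		if features[b] {
			withBoth++
		}
	}
	return withA, withBoth
}

// percent returns part as a percentage of total.
func percent(part, total int) float64 {
	return 100 * float64(part) / float64(total)
}

func main() {
	// Tiny made-up graph, only to demonstrate the counting.
	nodes := []map[int]bool{
		{31: true, 27: true},
		{31: true},
		{27: true},
	}
	withA, withBoth := countFeatures(nodes, 31, 27)
	fmt.Println(withA, withBoth) // prints 2 1

	// Share of feature-31 nodes that also advertise feature 27, using the
	// counts reported above.
	fmt.Printf("%.2f%%\n", percent(7003, 7036)) // prints 99.53%
}
```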

@saubyk
Collaborator

saubyk commented Nov 27, 2023

> Unfortunately no new feature bits have been added after that, so it is unclear what the proportion of nodes running 0.16+ is.

Given that the mitigation for the replacement cycling attack was made available from version 0.16.1 onwards, can we assume that most lnd nodes on the network would have upgraded?

Also, what would be the impact if a portion of the nodes do not support long failure messages?

@joostjager
Contributor Author

> Given that the mitigation for the replacement cycling attack was made available from version 0.16.1 onwards, can we assume that most lnd nodes on the network would have upgraded?

Sounds reasonable.

> Also, what would be the impact if a portion of the nodes do not support long failure messages?

A non-compatible sender isn't able to properly attribute the error to a node (or pair of nodes) and will penalize the full route, if I remember correctly from #6913.

Labels
inbound fee Changes related to inbound routing fee P1 MUST be fixed or reviewed routing