lnwallet: limit received htlc's to MaxAcceptedHTLCs #3910
Conversation
Force-pushed from 8c86cb3 to f1e84ed
lnwallet/channel.go
Outdated
// Clamp down on the number of HTLC's we can receive. It should never
// exceed our configurable MaxAcceptedHTLCs.
maxHTLCs := lc.channelState.LocalChanCfg.MaxAcceptedHtlcs
remoteACKedIndex := lc.localCommitChain.tail().theirHtlcIndex
This will be the index of the last ACKed HTLC, not the first active one, so this ends up checking the number of unACKed HTLCs, which I don't think is what we want.
What we want to check is the number of active HTLCs, which includes both ACKed and unACKed ones. I think we must check the update logs or the commitments to see which HTLCs are actually active.
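For illustration only, here is a minimal sketch of counting the active (both ACKed and unACKed) incoming HTLCs by walking an update log; the types and field names below are hypothetical stand-ins, not lnd's actual PaymentDescriptor or update-log structures.

```go
package main

import "fmt"

// updateType and paymentDescriptor are simplified, hypothetical stand-ins
// for the entries an HTLC update log might hold.
type updateType int

const (
	addHTLC updateType = iota
	settleHTLC
	failHTLC
)

type paymentDescriptor struct {
	entryType   updateType
	htlcIndex   uint64 // index of this add
	parentIndex uint64 // for settles/fails: index of the add they remove
}

// countActiveHtlcs counts adds that have not been removed by a settle or
// fail anywhere in the log, i.e. active HTLCs regardless of whether they
// have been ACKed yet, rather than only the updates past the last ACKed
// index.
func countActiveHtlcs(log []paymentDescriptor) int {
	removed := make(map[uint64]bool)
	for _, pd := range log {
		if pd.entryType == settleHTLC || pd.entryType == failHTLC {
			removed[pd.parentIndex] = true
		}
	}

	active := 0
	for _, pd := range log {
		if pd.entryType == addHTLC && !removed[pd.htlcIndex] {
			active++
		}
	}
	return active
}

func main() {
	log := []paymentDescriptor{
		{entryType: addHTLC, htlcIndex: 0},
		{entryType: addHTLC, htlcIndex: 1},
		{entryType: settleHTLC, parentIndex: 0},
	}
	fmt.Println(countActiveHtlcs(log)) // prints 1: only HTLC 1 is still active
}
```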
Ahhh you're right
see comment
Force-pushed from f1e84ed to ac91616
PTAL @halseth
LGTM 💯
Travis failing?
timeout flake with TestMppPayment I think
Force-pushed from ac91616 to 528c964
@halseth Updated the calculation to start from the tip rather than the tail, since any new adds will be on top of that. Also moved the calculation to a function and updated the unit test.
Yes, makes sense! Looks good 👍
There might be some unexpected implications for the itests; looks like they are all failing the same test now.
lntest/itest/lnd_test.go
Outdated
@@ -10145,7 +10145,7 @@ func testBidirectionalAsyncPayments(net *lntest.NetworkHarness, t *harnessTest)
 	}

 	// Calculate the number of invoices.
-	numInvoices := int(info.LocalBalance / paymentAmt)
+	numInvoices := input.MaxHTLCNumber / 2
Do we know why this didn't fail before?
Since the new check is on the receive side, that could mean we are not properly checking on the sender side that we are not overflowing the commitment.
You're right - the last commit is not the correct fix. The sender is overflowing the commitment, but there is a check there. I suspect an off-by-one error somewhere...
Force-pushed from b27d519 to a92f0e9
async bidi test seems happy with latest fix, LGTM 🤘
lnwallet/channel.go
Outdated
// channel. If, however, we adjust the committed HTLC's count to ignore
// HTLC's which will be removed in the next signature that Alice sends,
// then the add in step 6 won't overflow. This is because the signature
// received in step 7 will cover the settle/fail in step 1 and makes space
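A toy illustration of the adjustment described in the quoted comment: HTLCs that the counterparty's next signature will already remove are not counted against the limit. The function names and inputs are invented for this sketch.

```go
package lnwalletsketch

// adjustedHtlcCount returns how many HTLC slots a new add would effectively
// occupy: the HTLCs currently committed, minus the ones with a pending
// settle/fail that the next signature will cover, plus the new add itself.
func adjustedHtlcCount(committed, pendingRemovals int) int {
	return committed - pendingRemovals + 1
}

// wouldOverflow reports whether accepting one more add would exceed the
// negotiated maximum once pending removals are taken into account.
func wouldOverflow(committed, pendingRemovals, maxAcceptedHTLCs int) bool {
	return adjustedHtlcCount(committed, pendingRemovals) > maxAcceptedHTLCs
}
```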
excellent comments, very helpful!
I updated the code to just use validateCommitmentSanity since it was doing the same thing. I didn't feel the need to add a unit test for the specific case outlined above since it's not a new function like calcIncomingHtlcs. Curious what you think.
lnwallet/channel.go
Outdated
@@ -2788,7 +2788,7 @@ func (lc *LightningChannel) createCommitDiff(
 // in the state when adding a new HTLC, or nil otherwise.
 func (lc *LightningChannel) validateCommitmentSanity(theirLogCounter,
 	ourLogCounter uint64, remoteChain bool,
-	predictAdded *PaymentDescriptor) error {
+	predictAdded, predictTheirs *PaymentDescriptor) error {
The godoc should be updated to describe this new argument. Alternatively, seems like we can just pass it in via predictAdded and then add it to the appropriate log based on remoteChain.
Possibly we would need this extra argument, to allow the case where we are predicting adding our add to the remote commitment chain, but would need to look at the callsites for the method.
Yes, can do this: if remoteChain is true, add to our updates; otherwise, add to theirs.
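A hedged sketch of the alternative being discussed: keep a single predicted update and route it to the appropriate side of the evaluated view based on remoteChain. The types and the omitted limit checks are placeholders, not lnd's actual validateCommitmentSanity.

```go
package lnwalletsketch

// paymentDescriptor and htlcView are simplified placeholders for the
// snapshot of updates that a commitment sanity check would evaluate.
type paymentDescriptor struct {
	amount int64
}

type htlcView struct {
	ourUpdates   []*paymentDescriptor
	theirUpdates []*paymentDescriptor
}

// validateCommitmentSanitySketch appends the single predicted add to our
// updates when evaluating the remote chain (we are predicting our own add
// landing on the remote commitment), and to their updates otherwise. The
// actual HTLC-count, amount, and reserve checks are elided.
func validateCommitmentSanitySketch(view *htlcView, remoteChain bool,
	predictAdded *paymentDescriptor) error {

	if predictAdded != nil {
		if remoteChain {
			view.ourUpdates = append(view.ourUpdates, predictAdded)
		} else {
			view.theirUpdates = append(view.theirUpdates, predictAdded)
		}
	}

	// ... evaluate view against maxAcceptedHTLCs and the other channel
	// constraints here (elided in this sketch).
	return nil
}
```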
// Clamp down on the number of HTLC's we can receive by checking the
// commitment sanity.
err := lc.validateCommitmentSanity(
Yeah, nice, I think this is it. If I'm not mistaken this will also handle the CASE 2 you outlined earlier, since it will apply the settles/fails accordingly?
Yup, it applies the same logic that the sender applies, so it handles case 2.
lnwallet/channel.go
Outdated
@@ -2788,7 +2788,7 @@ func (lc *LightningChannel) createCommitDiff(
 // in the state when adding a new HTLC, or nil otherwise.
 func (lc *LightningChannel) validateCommitmentSanity(theirLogCounter,
 	ourLogCounter uint64, remoteChain bool,
-	predictAdded *PaymentDescriptor) error {
+	predictAdded, predictTheirs *PaymentDescriptor) error {
new unit test looks solid, could use a squash :)
// Clamp down on the number of HTLC's we can receive by checking the
// commitment sanity.
err := lc.validateCommitmentSanity(
Doesn't this early validation violate the protocol? For example, if we receive and settle an HTLC in the same batch with no more commitment slots left? If we validated only the whole batch, that would still be possible, perhaps?
I'm not entirely sure I follow - the prior behavior was to do no validation and wait until a commitment was received to do the validation. This just makes the validation happen earlier, so there shouldn't be a problem. It also mimics lnd's sender-side logic (which validates upon calling AddHTLC), so if this is a protocol violation, then that logic would be too.
I think it wouldn't be a protocol violation, since it will only add an additional HTLC if their next signature for us must contain a settle/fail that would remove one HTLC to make room for this new one.
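As a rough, hypothetical illustration of "validate at receive time the same way the sender validates in AddHTLC": the toy channel and its validateSanity method below only approximate the idea and are not lnd's API.

```go
package lnwalletsketch

import "errors"

// htlcUpdate is a simplified placeholder for an incoming HTLC add.
type htlcUpdate struct {
	amount int64
}

// toyChannel holds only what this sketch needs: a log of received adds and
// the negotiated HTLC limit.
type toyChannel struct {
	remoteLog        []*htlcUpdate
	maxAcceptedHTLCs int
}

// validateSanity stands in for a validateCommitmentSanity-style check that
// predicts the commitment with the new add applied. Settles/fails are
// ignored here to keep the sketch short.
func (c *toyChannel) validateSanity(add *htlcUpdate) error {
	if len(c.remoteLog)+1 > c.maxAcceptedHTLCs {
		return errors.New("commitment would exceed max accepted HTLCs")
	}
	return nil
}

// ReceiveHTLC runs the same sanity check the sender runs, before appending
// the incoming add to the update log, instead of waiting for the next
// commitment signature to surface the failure.
func (c *toyChannel) ReceiveHTLC(add *htlcUpdate) (uint64, error) {
	if err := c.validateSanity(add); err != nil {
		return 0, err
	}
	c.remoteLog = append(c.remoteLog, add)
	return uint64(len(c.remoteLog) - 1), nil
}
```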
Yes, the question is indeed if it is a protocol violation. If the commitment is full, should we allow another htlc to be added? Because before the signature comes, a settle may still be added.
So there are two places in the spec I can see where this is elaborated on:
if result would be offering more than the remote's max_accepted_htlcs HTLCs, in the remote commitment transaction:
MUST NOT add an HTLC.
and
if a sending node adds more than receiver max_accepted_htlcs HTLCs to its local commitment transaction [...]:
SHOULD fail the channel.
So they only seem concerned about the effect on the commitment transaction, so I think this won't violate the spec, since ReceiveHTLC just goes into the update logs. Plus, it's only applied to the commit tx later, because there is a settle/fail that ensures only the max can be applied to the commit tx.
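Read together, the two quoted requirements amount to symmetric checks, roughly as in the sketch below (illustrative only; real implementations also have to account for in-flight settles/fails when computing these counts).

```go
package lnwalletsketch

// senderWouldViolate captures the first quote: the sender MUST NOT add an
// HTLC if the result would be more than the remote peer's
// max_accepted_htlcs on the remote commitment transaction.
func senderWouldViolate(offeredOnRemoteCommit, remoteMaxAcceptedHTLCs int) bool {
	return offeredOnRemoteCommit+1 > remoteMaxAcceptedHTLCs
}

// receiverShouldFail captures the second quote: if the sender has added more
// than our max_accepted_htlcs to our local commitment transaction, we SHOULD
// fail the channel.
func receiverShouldFail(receivedOnLocalCommit, localMaxAcceptedHTLCs int) bool {
	return receivedOnLocalCommit > localMaxAcceptedHTLCs
}
```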
It still feels a little inconclusive. But since the second quote is from the description of the add message, I indeed think that validation should already take place at that point and not be postponed until the signature arrives.
Is this worth bringing up / making an issue in the spec repo? I don't see the receive logic being the problem, but rather the sending logic. If we think a send in this async case is ok but our peer thinks it's a violation (esp w/ diff implementations), then I'd see a problem as we may be disconnected. All in all though, the receive should match the send logic.
Worth bringing up as a spec issue I think, although I think it is pretty obvious that each received message would have to lead to a valid commitment transaction. Otherwise, how many updates would you allow while waiting for a settle to open up the commitment? But I agree that this could be more clearly spelled out in the spec.
Force-pushed from 3970197 to b783829
This commit checks the commitment sanity when receiving an HTLC so that if a commitment transaction will overflow from an ADD, it is caught earlier rather than in ReceiveNewCommitment.
This commit fixes the TestMaxAcceptedHTLCs, TestMaxPendingAmount, TestMinHTLC, & TestChanReserve unit tests to pass with the new ReceiveHTLC logic. Instead of asserting specific failures upon receiving a new commitment signature, the various assertions were moved to assert on the error returned from ReceiveHTLC.
Adds a new test which asserts that the new ReceiveHTLC logic can handle proper commitment overflow calculation in the face of asynchronous updates.
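For intuition, a toy version of the asynchronous scenario such a test targets: the commitment is full and a settle is already queued, so one more incoming add fits but a second one must be rejected. Everything here (the toy channel type and its fields) is invented for illustration and is not the lnd test harness.

```go
package lnwalletsketch

import (
	"errors"
	"testing"
)

// asyncToyChannel tracks just enough state for this scenario: committed
// HTLCs and queued settles/fails that the next signature will remove.
type asyncToyChannel struct {
	committed       int
	pendingRemovals int
	maxAccepted     int
}

// receiveAdd accepts a new incoming HTLC only if the commitment, once the
// pending removals are signed for, still has room for it.
func (c *asyncToyChannel) receiveAdd() error {
	if c.committed-c.pendingRemovals+1 > c.maxAccepted {
		return errors.New("too many HTLCs on commitment")
	}
	c.committed++
	return nil
}

// TestAsyncOverflowSketch mirrors the shape of the scenario: with a full
// commitment and one settle queued, one more add fits; a second add must
// overflow and be rejected.
func TestAsyncOverflowSketch(t *testing.T) {
	c := &asyncToyChannel{committed: 5, pendingRemovals: 1, maxAccepted: 5}

	if err := c.receiveAdd(); err != nil {
		t.Fatalf("add should be accepted, the pending settle makes room: %v", err)
	}
	if err := c.receiveAdd(); err == nil {
		t.Fatal("second add should overflow the commitment")
	}
}
```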
Force-pushed from b783829 to 5a5e095
Noticed Travis was failing, but it's unrelated to this change: #4010
// Alice sends to Bob in step 6 does not overflow Bob's commitment transaction.
// This is because validateCommitmentSanity counts the HTLC's by ignoring
// HTLC's which will be removed in the next signature that Alice sends. Thus,
// the add won't overflow. This is because the signature received in step 7
Won't this contradict the conclusion in #3910 (review)?
In the end, we can only really execute these heuristics on a best-effort basis. There are other classes of the concurrent-send case that aren't covered in this test/scenario. At the end of the day, the real defense point is at the sender, in that they should be careful not to send updates that would result in a force close.
LGTM
Latest version is much simpler than the prior version I reviewed, which is great! As we dig into this area more and more, it becomes more apparent how underspecified the spec is w.r.t. proper adherence to the current flow-control limits. As discussed offline, it may actually take some protocol state machine level changes in order to fully rectify this set of issues.
This PR clamps down the number of received HTLCs to MaxAcceptedHTLCs.