Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lnwallet: limit received htlc's to MaxAcceptedHTLCs #3910

Merged
merged 3 commits into from
Feb 19, 2020

Conversation

Crypt-iQ
Copy link
Collaborator

This PR clamps down the number of received HTLCs to MaxAcceptedHTLCs.

@Crypt-iQ Crypt-iQ requested a review from Roasbeef January 14, 2020 04:42
@Crypt-iQ Crypt-iQ requested a review from halseth as a code owner January 14, 2020 04:42
@Crypt-iQ Crypt-iQ added this to the 0.9.1 milestone Jan 14, 2020
// Clamp down on the number of HTLC's we can receive. It should never
// exceed our configurable MaxAcceptedHTLCs.
maxHTLCs := lc.channelState.LocalChanCfg.MaxAcceptedHtlcs
remoteACKedIndex := lc.localCommitChain.tail().theirHtlcIndex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be the index of the last acked htlc, not the first one active, which results in checking the number of unACKed HTLCs, which I don't think is what we want?

What we want to check is the number of active HTLCs, which will include both ACKed and unACKed. I think we must check the update logs or the commitments to see which HTLCs are actually active.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh you're right

Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment

@Crypt-iQ
Copy link
Collaborator Author

PTAL @halseth

@halseth halseth self-requested a review January 20, 2020 12:35
Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯

@halseth
Copy link
Contributor

halseth commented Jan 21, 2020

Travis failing?

@Crypt-iQ
Copy link
Collaborator Author

Travis failing?

timeout flake with TestMppPayment I think

@Crypt-iQ
Copy link
Collaborator Author

Crypt-iQ commented Jan 21, 2020

@halseth Updated the calculation to start from the tip rather than tail since any new adds will be on top of that. Also moved the calculation to a function + updated the unit test.

@halseth
Copy link
Contributor

halseth commented Jan 22, 2020

@halseth Updated the calculation to start from the tip rather than tail since any new adds will be on top of that. Also moved the calculation to a function + updated the unit test.

Yes, makes sense! Looks good 👍

@halseth
Copy link
Contributor

halseth commented Jan 22, 2020

There might be some unexpected implication for the itests, looks like they are all failing the same test now.

@@ -10145,7 +10145,7 @@ func testBidirectionalAsyncPayments(net *lntest.NetworkHarness, t *harnessTest)
}

// Calculate the number of invoices.
numInvoices := int(info.LocalBalance / paymentAmt)
numInvoices := input.MaxHTLCNumber / 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why this didn't fail before?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the new check is on the receive side that could mean we are not properly checking sender side we are not overflowing the commitment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - the last commit is not the correct fix. The sender is overflowing the commitment, but there is a check there. I suspect an off-by-one error somewhere...

@Crypt-iQ Crypt-iQ force-pushed the htlc_add_0113 branch 2 times, most recently from b27d519 to a92f0e9 Compare February 7, 2020 20:38
@Crypt-iQ Crypt-iQ requested a review from halseth February 8, 2020 00:21
Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async bidi test seems happy with latest fix, LGTM 🤘

lnwallet/channel.go Outdated Show resolved Hide resolved
// channel. If, however, we adjust the committed HTLC's count to ignore
// HTLC's which will be removed in the next signature that Alice sends,
// then the add in step 6 won't overflow. This is because the signature
// received in step 7 will cover the settle/fail in step 1 and makes space
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent comments, very helpful!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code to just use validateCommitmentSanity since it was doing the same thing. I didn't feel the need to add a unit test for the specific case outlined above since it's not a new function like calcIncomingHtlcs. Curious what you think.

lnwallet/channel.go Outdated Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved
@@ -2788,7 +2788,7 @@ func (lc *LightningChannel) createCommitDiff(
// in the state when adding a new HTLC, or nil otherwise.
func (lc *LightningChannel) validateCommitmentSanity(theirLogCounter,
ourLogCounter uint64, remoteChain bool,
predictAdded *PaymentDescriptor) error {
predictAdded, predictTheirs *PaymentDescriptor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the godoc should be updated to describe this new argument. alternatively, seems like we can just pass it in via predictAdded and then add it to the appropriate log based on remoteChain

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly we would need this extra argument, to allow the case where we are predicting adding our add to the remote commitment chain, but would need to look at the callsites for the method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes can do this, if remoteChain is true, add to our updates. Otherwise, add to theirs.


// Clamp down on the number of HTLC's we can receive by checking the
// commitment sanity.
err := lc.validateCommitmentSanity(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, nice I think this is it. If I'm not mistaken this will also handle the CASE 2 you outlined earlier, since it will apply the settles/fails accordingly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup it applies the same logic that the sender applies so it handles case 2

@@ -2788,7 +2788,7 @@ func (lc *LightningChannel) createCommitDiff(
// in the state when adding a new HTLC, or nil otherwise.
func (lc *LightningChannel) validateCommitmentSanity(theirLogCounter,
ourLogCounter uint64, remoteChain bool,
predictAdded *PaymentDescriptor) error {
predictAdded, predictTheirs *PaymentDescriptor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly we would need this extra argument, to allow the case where we are predicting adding our add to the remote commitment chain, but would need to look at the callsites for the method.

Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new unit test looks solid, could use a squash :)

lnwallet/channel_test.go Outdated Show resolved Hide resolved

// Clamp down on the number of HTLC's we can receive by checking the
// commitment sanity.
err := lc.validateCommitmentSanity(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this early validation violate the protocol? For example if we'd receive and settle an htlc in the same batch with no more commitment slot left? If validating only for the whole batch, that would still be possible perhaps?

Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure I follow - the prior behavior was to do no validation and wait until a commitment was received to do the validation. This just makes the validation earlier so there shouldn't be a problem. It also mimics lnd's sender-side logic (to validate upon calling AddHTLC) so if this is a protocol violation, then that logic would be too.

I think it wouldn't be a protocol violation since it will only add an additional HTLC if their next signature for us must have a settle/fail that would remove one HTLC to make room for this new one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the question is indeed if it is a protocol violation. If the commitment is full, should we allow another htlc to be added? Because before the signature comes, a settle may still be added.

Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two places in the spec I can see where this is elaborated on:

if result would be offering more than the remote's max_accepted_htlcs HTLCs, in the remote commitment transaction:

    MUST NOT add an HTLC.

and

if a sending node adds more than receiver max_accepted_htlcs HTLCs to its local commitment transaction [...]:

    SHOULD fail the channel.

So they only seem concerned about the effect on the commitment transaction. So I think this won't violate the spec since ReceiveHTLC just goes into the update logs. Plus it's only applied to the commit tx later because there is a settle/fail that ensures only the max can be applies to the commit tx.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still feels a little inconclusive. But as the second quote is from the description of the add message, I indeed think that validation should take place already then and not be postponed until the signature arrives.

Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Feb 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this worth bringing up / making an issue in the spec repo? I don't see the receive logic being the problem, but rather the sending logic. If we think a send in this async case is ok but our peer thinks it's a violation (esp w/ diff implementations), then I'd see a problem as we may be disconnected. All in all though, the receive should match the send logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth bringing up as a spec issue I think, although I think it is pretty obvious that each received message would have to lead to a valid commitment transaction. Otherwise how many updates would you allow while waiting fo a settel to open up the commitment. But agree that this could be more clearly spelled out in the spec.

@Crypt-iQ Crypt-iQ force-pushed the htlc_add_0113 branch 2 times, most recently from 3970197 to b783829 Compare February 15, 2020 14:53
This commit checks the commitment sanity when receiving an HTLC so
that if a commitment transaction will overflow from an ADD, it is
caught earlier rather than in ReceiveNewCommitment.
This commit fixes the TestMaxAcceptedHTLCs, TestMaxPendingAmount,
TestMinHTLC, & TestChanReserve unit tests to pass with the new
ReceiveHTLC logic. Instead of asserting specific failures upon
receiving a new commitment signature, the various assertions were
moved to assert on the error returned from ReceiveHTLC.
Adds a new test which asserts that the new ReceiveHTLC logic can
handle proper commitment overflow calculation in the face of
asynchronous updates.
@halseth
Copy link
Contributor

halseth commented Feb 17, 2020

Noticed travis was failing, but unrelated to this change: #4010

lnwallet/channel.go Show resolved Hide resolved
// Alice sends to Bob in step 6 does not overflow Bob's commitment transaction.
// This is because validateCommitmentSanity counts the HTLC's by ignoring
// HTLC's which will be removed in the next signature that Alice sends. Thus,
// the add won't overflow. This is because the signature received in step 7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this contradict with the conclusion in #3910 (review) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, we can only really execute these heuristics on a best effort basis. There're other classes of the concurrent send case that aren't covered in this test/scenario. At the end of the day, the real defense point is at the sender, in that they should be careful to not send items that would result in a force close.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ✈️

Latest version is much simpler than the prior version I reviewed, which is great! As we dig into this area more and more it becomes more apparent how underspecified the spec w.r.t proper adherence to the current flow control limits. As discussed offline, it may actually take some protocol state machine level changes in order to fully rectify this set of issues.

// Alice sends to Bob in step 6 does not overflow Bob's commitment transaction.
// This is because validateCommitmentSanity counts the HTLC's by ignoring
// HTLC's which will be removed in the next signature that Alice sends. Thus,
// the add won't overflow. This is because the signature received in step 7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, we can only really execute these heuristics on a best effort basis. There're other classes of the concurrent send case that aren't covered in this test/scenario. At the end of the day, the real defense point is at the sender, in that they should be careful to not send items that would result in a force close.

@Roasbeef Roasbeef merged commit 92b79f6 into lightningnetwork:master Feb 19, 2020
@Crypt-iQ Crypt-iQ deleted the htlc_add_0113 branch February 19, 2020 02:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants