Unable to pay extreme multipart payment in tests #7605

Closed · Lagrang3 opened this issue Aug 23, 2024 · 3 comments · Fixed by #7606

Lagrang3 commented Aug 23, 2024

As a follow-up to issue #7563, I tested renepay on the same topology that @daywalker90 proposed
for testing getroutes. The payment failed with "failed to find a feasible flow"
due to a sequence of HTLC failures on a remote channel that should have had enough liquidity.

To reproduce the problem I wrote the following test using only sendpay.
The payment flow is l1 -> l2 -> l3, where more than one channel connects l1->l2
and more than one connects l2->l3.

When I try to send a 4-part payment with the following routes:

  • part 1 with 350k in the route l1->l2->l3 over a pair of channels with capacity 400k,
  • part 2 with 250k in the route l1->l2->l3 over a pair of channels with capacity 300k,
  • part 3 with 150k in the route l1->l2->l3 over a pair of channels with capacity 200k,
  • part 4 with 50k in the route l1->l2->l3 over a pair of channels with capacity 100k,

one or more payment parts always fail at the l2->l3 hop with CHANNEL_ERR_CHANNEL_CAPACITY_EXCEEDED
in l2's logs.
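On paper each part fits comfortably in its assigned channel pair. A rough back-of-the-envelope check, assuming the default 1% channel reserve and ignoring commitment-transaction fees (routing fees are zero in this setup):

# Rough sanity check (assumptions: default 1% channel reserve, commitment
# fees ignored, zero routing fees): every part is well below what its
# assigned channels should be able to carry.
parts_sat = [350_000, 250_000, 150_000, 50_000]
capacities_sat = [400_000, 300_000, 200_000, 100_000]

for amount, capacity in zip(parts_sat, capacities_sat):
    spendable_bound = capacity - capacity // 100   # subtract the ~1% reserve
    assert amount <= spendable_bound
    print(f"{amount} sat over {capacity} sat channels: fits (bound {spendable_bound} sat)")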

A simpler case that also fails:

  • part 1 with 200k in the route l1->l2->l3 over a pair of channels with capacity 400k,
  • part 2 with 200k in the route l1->l2->l3 over a pair of channels with capacity 300k.

If instead I make a single-part payment:

  • 350k in the route l1->l2->l3 over a pair of channels with capacity 400k,

the attempt succeeds.

The test:
# Imports for running this under pyln-testing (the node_factory fixture comes
# from pyln.testing.fixtures when used inside Core Lightning's pytest suite).
from pyln.client import Millisatoshi
from pyln.testing.utils import only_one, sync_blockheight, wait_for


def get_local_channel_by_id(node, chanid):
    """Return node's local channel with the given channel_id, or None."""
    peerchannels = node.rpc.listpeerchannels()['channels']
    if not peerchannels:
        return None
    for c in peerchannels:
        if c['channel_id'] == chanid:
            return c
    return None

def start_channels(connections):
    nodes = list()
    for src, dst, fundamount in connections:
        nodes.append(src)
        nodes.append(dst)
        src.rpc.connect(dst.info["id"], "localhost", dst.port)

    bitcoind = nodes[0].bitcoin
    # If we got here, we want to fund channels
    for src, dst, fundamount in connections:
        addr = src.rpc.newaddr()["bech32"]
        bitcoind.rpc.sendtoaddress(addr, (fundamount + 1000000) / 10**8)

    bitcoind.generate_block(1)
    sync_blockheight(bitcoind, nodes)
    txids = []
    chan_ids = []
    for src, dst, fundamount in connections:
        reply = src.rpc.fundchannel(dst.info["id"], fundamount, announce=True, minconf=0)
        txids.append(reply["txid"])
        chan_ids.append(reply["channel_id"])

    # Confirm all channels and wait for them to become usable
    bitcoind.generate_block(1, wait_for_mempool=txids)
    scids = []
    for con, mychan_id in zip(connections, chan_ids):
        src = con[0]
        wait_for(lambda: get_local_channel_by_id(src, mychan_id)['state'] == "CHANNELD_NORMAL")
        scids.append(get_local_channel_by_id(src, mychan_id)['short_channel_id'])

    # Make sure they have all seen block so they don't complain about
    # the coming gossip messages
    sync_blockheight(bitcoind, nodes)
    bitcoind.generate_block(5)

    # Make sure everyone sees all channels, all other nodes
    for n in nodes:
        for scid in scids:
            n.wait_channel_active(scid)

    # Make sure we have all node announcements, too
    for n in nodes:
        for n2 in nodes:
            wait_for(
                lambda: "alias" in only_one(n.rpc.listnodes(n2.info["id"])["nodes"])
            )
    return chan_ids

def test_channel_reserve(node_factory):
    def direction(node1, node2):
        return 0 if node1.info['id']<node2.info['id'] else 1
    
    opts = {"disable-mpp": None, "fee-base": 0, "fee-per-satoshi": 0, "cltv-delta": 6}
    l1, l2, l3 = node_factory.get_nodes(3, opts=opts)
    
    chan_ids = start_channels(
        [
            (l1, l2, 400_000),
            (l2, l3, 400_000),
            (l1, l2, 300_000),
            (l2, l3, 300_000),
            (l1, l2, 200_000),
            (l2, l3, 200_000),
            (l1, l2, 100_000),
            (l2, l3, 100_000),
        ]
    )
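    # chan_ids preserves the order of the connections above: even indices are
    # the l1->l2 channels, odd indices the matching l2->l3 channels, in
    # decreasing capacity (400k, 300k, 200k, 100k sat).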
    
    # we should be able to send these four parts:
    # nparts = 4
    # route_amounts = ["350000sat", "250000sat", "150000sat", "50000sat"]
    # instead the test fails even with these:
    nparts = 2
    route_amounts = ["201000sat", "200000sat", "150000sat", "50000sat"]
    total_msat = sum( [ Millisatoshi(a) for a in route_amounts[:nparts] ])
    
    # Test succeeds if we are able to pay this invoice
    inv = l3.rpc.call("invoice", {"amount_msat": total_msat,
        "label":"inv","description":"inv","cltv":10})
    inv_decode = l1.rpc.decode(inv["bolt11"])
    
    # Shared data by every route we will construct: l1->l2->l3
    route = [{"id": l2.info["id"],
        "direction": direction(l1,l2),
        "delay": 16,
        "style": "tlv"},
        {"id": l3.info["id"],
        "direction": direction(l2,l3),
        "delay": 10,
        "style": "tlv"}]

    # Send every part with sendpay
    for part in range(nparts):
        this_part_msat = Millisatoshi(route_amounts[part])
        chan1 = get_local_channel_by_id(l1, chan_ids[part*2])
        chan2 = get_local_channel_by_id(l2, chan_ids[part*2+1])
        
        route[0]["channel"] = chan1["short_channel_id"]
        route[1]["channel"] = chan2["short_channel_id"]
        route[0]["amount_msat"] = route[1]["amount_msat"] = this_part_msat
        
        assert chan1["spendable_msat"]>=this_part_msat
        assert chan2["spendable_msat"]>=this_part_msat
        
        l1.rpc.call(
            "sendpay",
            {"route": route,
             "payment_hash": inv["payment_hash"],
             "payment_secret": inv["payment_secret"],
             "amount_msat": total_msat,
             "groupid": 1,
             "partid": part+1,},
        )
    l1.wait_for_htlcs()
    
    # Are we happy?
    waitsendpay_response = l1.rpc.call("waitsendpay",{"payment_hash":
        inv["payment_hash"], "partid": 1, "groupid": 1})
    receipt = only_one(l3.rpc.listinvoices("inv")["invoices"])
    assert receipt["status"] == "paid"
    assert receipt["amount_received_msat"] == total_msat
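For completeness, the original failure was observed with renepay on this same topology. A minimal sketch of exercising that path at the end of the test, assuming the renepay plugin bundled with Core Lightning and its invstring parameter are available (the 600000sat amount is an arbitrary choice here, picked only to force a multi-part split):

# Sketch (assumption: bundled renepay plugin is loaded): reproduce the
# originally reported "failed to find a feasible flow" end to end.
inv2 = l3.rpc.call("invoice", {"amount_msat": Millisatoshi("600000sat"),
                               "label": "inv-renepay", "description": "inv-renepay"})
l1.rpc.call("renepay", {"invstring": inv2["bolt11"]})
assert only_one(l3.rpc.listinvoices("inv-renepay")["invoices"])["status"] == "paid"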

Lagrang3 commented Aug 23, 2024

I think the problem is with the parallel channels. On l2, channeld only sees one channel with its peer l1.

I can see that lightningd's peer structure has a list of channels (defined in lightningd/peer_control.h), but channeld's peer structure holds only a single channel (defined in channeld/channeld.c). I'll keep looking...

UPD: there is one channeld instance per channel, so maybe lightningd is sending the HTLC request to the wrong subdaemon?

In fact, in the logs:

035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-channeld-chan#2: Adding HTLC 0 amount=501000000msat cltv=119 gave CHANNEL_ERR_ADD_OK
...
035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-channeld-chan#2: Adding HTLC 1 amount=500000000msat cltv=119 gave CHANNEL_ERR_CHANNEL_CAPACITY_EXCEEDED

Both HTLC requests go to the same channeld-chan#2; one should have gone to channeld-chan#2 and the other to channeld-chan#4.
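This can also be cross-checked from the test against l2's logs; a sketch, assuming pyln-testing's is_in_log helper and that the chan#2/chan#4 numbering of this run matches the excerpt above:

# Sketch: confirm from l2's logs that both HTLC adds hit the same channeld
# instance. Assumes pyln-testing's daemon.is_in_log() and that the
# "chan#2"/"chan#4" numbering matches this particular run.
assert l2.daemon.is_in_log(r'channeld-chan#2: Adding HTLC 0 .* gave CHANNEL_ERR_ADD_OK')
assert l2.daemon.is_in_log(r'channeld-chan#2: Adding HTLC 1 .* gave CHANNEL_ERR_CHANNEL_CAPACITY_EXCEEDED')
# Had the requested short_channel_id been honored, the second add would
# have shown up under a different channeld (e.g. channeld-chan#4).
assert not l2.daemon.is_in_log(r'channeld-chan#4: Adding HTLC')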

Lagrang3 commented

There is a channel selection in lightningd before forwarding:

best = best_channel(ld, peer, p->amt_to_forward, c);

Is that a good idea? Why wouldn't we forward right on the channel we have been asked to use?

Clearly, in this case that function is not selecting the best channel.
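To make the two policies under discussion concrete, here is an illustrative Python sketch (not lightningd's actual C code; the function name pick_outgoing_channel and the dict fields are made up for illustration):

# Illustrative sketch only, not lightningd's implementation: contrast
# "honor the requested short_channel_id" with a best_channel-style re-pick.
# Re-picking can funnel several MPP parts onto the same parallel channel,
# which is consistent with both HTLCs landing on channeld-chan#2 above.
def pick_outgoing_channel(channels, requested_scid, amount_msat, honor_request):
    """channels: list of dicts with 'scid', 'peer_id' and 'spendable_msat'."""
    requested = next((c for c in channels if c['scid'] == requested_scid), None)
    if requested is None:
        return None
    if honor_request:
        # Forward exactly where the sender asked, or fail this HTLC.
        return requested if requested['spendable_msat'] >= amount_msat else None
    # One possible "best channel to the same peer" heuristic (the real
    # heuristic in lightningd may differ): largest spendable balance wins.
    candidates = [c for c in channels
                  if c['peer_id'] == requested['peer_id']
                  and c['spendable_msat'] >= amount_msat]
    return max(candidates, key=lambda c: c['spendable_msat'], default=None)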
