Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set Dont_Forward bit in Channel Update Message #7759

Closed

Conversation

ziggie1984
Copy link
Collaborator

@ziggie1984 ziggie1984 commented Jun 12, 2023

Fixes #7416

Change Description

According to the Specification we must set the dont_forward bit when we send a channelupdate msg to the peer.
With this change we make sure that our local channel updates for unannounced channels have this bit set.
Moreover we are performing a migration on the fly for all old unannounced channels as soon as the channel policy is updated.

We currently do not use this flag for our own decisions when forwarding a channel_update to the broader network (we use the AuthProof instead but we have to introduce this change to be spec. compliant.

Currently no new tests where added. Tests in the gossiper_test.go are not affected by this change because the bit is set on higher levels. But open for discussions regarding more tests.

Steps to Test

Steps for reviewers to follow to test the change.

Pull Request Checklist

Testing

  • Your PR passes all CI checks.
  • Tests covering the positive and negative (error paths) are included.
  • Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

📝 Please see our Contribution Guidelines for further guidance.


This change is Reviewable

@ziggie1984 ziggie1984 force-pushed the dont-forward-bit branch 3 times, most recently from 8530b00 to aaf3da6 Compare June 12, 2023 11:03
@ziggie1984 ziggie1984 marked this pull request as ready for review June 12, 2023 11:13
@saubyk saubyk requested a review from hieblmi June 12, 2023 18:22
lnwire/channel_update.go Outdated Show resolved Hide resolved
lnwire/channel_update.go Outdated Show resolved Hide resolved
funding/manager.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@hieblmi hieblmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ziggie1984 :-), thanks for picking up this feature! LGTM, just a few nits and questions.

Also re the test, I think in general it would be great to get new features covered by a test. Only in case it makes sense here, could you maybe add a check in a private channel test or a scid-alias test to confirm the dont_forward bit is set/unset in a public channel?

@ziggie1984 ziggie1984 force-pushed the dont-forward-bit branch 2 times, most recently from aaa1128 to 48f9fe8 Compare June 13, 2023 07:21
@ziggie1984
Copy link
Collaborator Author

@hieblmi thanks for taking a look, incorporated your comments. Agree with the tests, actually the failing unit-tests and itests forced me to update some test procedures to keep them running through and by this we test this new behaviour as well.

discovery/gossiper.go Outdated Show resolved Hide resolved
funding/manager.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@hieblmi hieblmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @ziggie1984, thanks for the quick turnaround. Only one minor nit.

@saubyk saubyk requested review from ProofOfKeags and Crypt-iQ June 15, 2023 15:24
@saubyk saubyk added this to the v0.17.0 milestone Jun 15, 2023
Copy link
Collaborator

@ProofOfKeags ProofOfKeags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments left in-line

Comment on lines 3119 to 3122
announcedChan := completeChan.ChannelFlags&lnwire.FFAnnounceChannel != 0
if peerAlias != nil {
announcedChan = false
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a a reason that this isn't:

Suggested change
announcedChan := completeChan.ChannelFlags&lnwire.FFAnnounceChannel != 0
if peerAlias != nil {
announcedChan = false
}
announcedChan := completeChan.ChannelFlags&lnwire.FFAnnounceChannel != 0 && peerAlias == nil

it seems like this more clearly captures the semantics.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was doing it because of the readability of the code, the 80 character limit would lead to something like this:

announcedChan := completeChan.ChannelFlags&
		lnwire.FFAnnounceChannel != 0 && peerAlias == nil

But ok to change if that's better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fewer times we reassign a variable the better imo. If you break it up put the second boolean condition on a new line rather than breaking the mask operation across lines.

Comment on lines +219 to +226
if edgeInfo.AuthProof == nil {
edge.MessageFlags |= lnwire.ChanUpdateDontForward
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to guarantee that if we don't have an AuthProof that we will set the ChanUpdateDontForward bit, but does not specify in any way what should happen if the AuthProof is present. Do we always want to leave it in whatever state it is currently in, or do we want to unset that bit if the AuthProof is present? Secondly, is the AuthProof always absent when the channel is unannounced? If so, is this invariant captured anywhere? It seems unintuitive to me that those two ideas are so tightly linked. However, if we can guarantee that that invariant will hold, then this seems like a safe enough way to do this.

Copy link
Collaborator Author

@ziggie1984 ziggie1984 Jun 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Catch, I need to add a testcase somewhere where we switch from unannounced to announced (zero-conf public channel). Because the current code would just have added the same edgeinfo as before including the msg flag (retrieving it from the db).

Lnd makes sure that we only have a valid proof specified when the channel was announced to the network so I think it is even stricter than just looking at the dont_forward bit.

Copy link
Collaborator

@ProofOfKeags ProofOfKeags Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah judging by some of the other parts of this PR, it seems that the AuthProof is a necessary but not sufficient condition for forwarding. We must also have an absence of the dont_forward bit. I think we need to mentally decouple the idea of the presence of the AuthProof and the decision to forward. I do not believe that zero-conf public channels will be the last time we see these ideas differ from one another.

@ziggie1984 ziggie1984 force-pushed the dont-forward-bit branch 3 times, most recently from 71d6847 to 590b811 Compare June 21, 2023 13:09
@ziggie1984
Copy link
Collaborator Author

So I reevaluated the situation and fixed the problem which @ProofOfKeags pointed out (whether we need to set the bit in case we have a channel proof). There is a situation where a prior private channel can be switched to public and this is a zeroconf public channel, which is private before 6 confs and later published to the network. In this case the prior codebase made sure we remove the old entry in the channel graph and add it under the real ShortChannel ID. But we did just use the old EdgePolicyInfo so that would have been a bug. Now I implemented it in a way where the MessageFlags are not overwritten.

However still some things I am indecisive about:

  1. I added a Log-Warning message in case we receive a channelupdate msg with the dont_forward bit set although this channel has a AuthProof. So question do we need this?

  2. I added a unit-test related to the public zeroconf channel and basically test that we do not overwrite the MessageFlags. Do you think this is worth it? I also contemplated adding an itest where we check for the MessageFlags in the ChannelUpdate message via the SubscribeToGraph RPC call. There I would have to add the MessageFlags into the EdgeInfoAnnouncement to compare it. I found it not worth it but happy to discuss it.

With that said looking forward to your final conclusion.

Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR🙏

Mostly questions, also have a question - do we use dont_forward bit anywhere else other than private channels?

@@ -3290,7 +3290,7 @@ func (f *Manager) extractAnnounceParams(c *channeldb.OpenChannel) (
func (f *Manager) addToRouterGraph(completeChan *channeldb.OpenChannel,
shortChanID *lnwire.ShortChannelID,
peerAlias *lnwire.ShortChannelID,
ourPolicy *channeldb.ChannelEdgePolicy) error {
ourPolicy *channeldb.ChannelEdgePolicy, shouldAnnounce bool) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think we can remove this new param and do the check inside this method?

	shouldAnnounce := completeChan.ChannelFlags&lnwire.FFAnnounceChannel == 1 &&
		completeChan.NumConfsRequired >= 6

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL that the network requires 6 confs and doesn't simply allow it to be accepted by the acceptor's minimum_depth 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled the check out for clarity but also ok with putting it back into the addToRouterGraph function (see #7759 (comment)), but also open to change it back.

// can be usable before six confirmations but we still keep the channel
// unannounced even for public channels until six confirmations are
// reached.
shouldAnnounce := channel.ChannelFlags&lnwire.FFAnnounceChannel == 1 &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do this check inside addToRouterGraph?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above.

@@ -4112,6 +4128,10 @@ func (f *Manager) newChanAnnouncement(localPubKey,
// max_htlc field.
msgFlags := lnwire.ChanUpdateRequiredMaxHtlc

if !shouldAnnounce {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually even further I'm wondering if we could use ourPolicy.MessageFlags here, but this won't be feasible unless we also modify storedFwdingPolicy, maybe a future refactor PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm not sure, so the ourPolicy flag is only not nil, it the channel is a ZeroConf channel to make sure we persist the policies which were present before the 6 confirmations. And because the msg flag bits might change see comment below, we cannot use the ourPolicy variable ?

// We are not reusing the MessageFlags because a channel might
// change from unannounced to announced during its lifetime.

})
// Only add the ChanUpdate to the gossiper if the
// __dont_forward__ bit signals it.
if !e1Ann.MessageFlags.HasDontForward() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused about this change - so I guess when we must be the participant of the channel when receiving this channel_announcement? Otherwise it means we received channel_update which shouldn't be forwarded in the first place since they have dont_forward bit set?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I can make sense of this. This happens in the scenario where we are processing a rejected edge. The only time we add an edge to our graph that we didn't validate the AuthProof for is if we are one of the parties on the channel. I'm speculating because I'm still new to this segment of the codebase but it seems that this only happens when we add the edge prior to announcement_signatures and then possibly process the remote's channel_update prior to a valid channel_announcement which contains the proof. So in this scenario it actually stands to reason that this bit should always be set here, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good questions!
So as it is coded currently, both lnd peers will readd the channel after 6 confirmations are reached with the _dont_forward_bit unset here: https://github.com/lightningnetwork/lnd/pull/7759/files#diff-bbccb27bb17be513ad84c77ad8e37a4d76168956f7978309aab29a0cbfdf91acR3487 (for Scid-channels it works a bit different but the channel is readded as well => hence a new ChanUpdate is sent (directly to the peer) so we anticipate the dont_forward bit not set in the above case. However we can not be sure whether the peer ratelimited us or something and the ChanUpdate did not make it into its graph. So thats why I integrated this check to make sure we follow the network rules and don't gossip a ChanUpdate with the bit set, potentially getting punished by other peers for not sticking to the rules?


// Only add the ChanUpdate to the gossiper if the
// __dont_forward__ bit signals it.
if !e2Ann.MessageFlags.HasDontForward() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: do e2 and e1 always have the same message flags?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in this case since we are processing a rejected edge, the only case where this happens is one where these flags are set, or even if they aren't, they ought to be.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know for sure, depending on how strict the peer rate-limits us, but in the best case scenario we should have the same message bits set here. I think it makes sense to check for the bit nonetheless just in case we somehow did not update the Graph-Data and do not broadcast a ChanUpdate which was never met to be broadcasted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code seems fine as-is, but can you elaborate -- what does the peer rate-limiting have to do with this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I was referring to the rate limiting because we consider a ChanUpdate as stale if it has not a greater timestamp than the prior one. (even if the content differ). This had effects in the itests where blocks were mined quickly and the time between block 3 and block 6 was lower than 1 sec, and when we or our peer announced the channel after 6 blocks it had happened that the old ChanUpdate from our Peer was still in our db. That should never happen in the real world but thats why this check is also important here (at least for the itests).

@@ -2271,6 +2282,16 @@ func IsKeepAliveUpdate(update *lnwire.ChannelUpdate,
if update.MessageFlags.HasMaxHtlc() && !prev.MessageFlags.HasMaxHtlc() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this line can be removed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure also to remain backwards compatible ? There might be old nodes out there which have not updated to the latest release where the MaxHTLC flag became mandatory so we don't want to consider those ChanUpdates as keepalive msgs ?

log.Debugf("Update edge for short_chan_id(%v) got: %v",
shortChanID, err)
} else if strings.Contains(err.Error(),
channeldb.ErrEdgeNotFound.Error()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do this here as it creates a possible DoS since it bypasses the isRecentlyRejectedMsg check.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right this is something I still not figured out (referring to the race condition here), I don't think its a DoS vector tho because the case where we do not find the corresponding edge is already covered here:

case channeldb.ErrEdgeNotFound:

The problem why we need to check for this case here, is that somehow for Alias channels we remove and add the edge after six confirmations and we might run into the case where we are receiving the new ChanUpdate (with the dont_foward bit unset) and also removing the edge leading to the case where the itests break. I guess I need to make sure that this call :

ourPolicy, err := f.cfg.DeleteAliasEdge(baseScid)

and the call here:

if err := d.cfg.Router.UpdateEdge(update, ops...); err != nil {

do not interfere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a separate commit with step-by-step info in the commit body. One TODO was to make the deletion + addition of edges atomic. That could be a separate PR that lands before this one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, I separated this commit and added a detailed commit body. I also included a TODO so that depending on whether this separate PR which makes the deletion+addition atomic lands before of after this change we can just delete it again.
However I am also in favour of adding a separate PR before this one and removing this commit altogether because I don't think we need it if its atomic. I will create an issue or a PR directly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this PR should land until atomic deletion and addition is done, just to keep things simple and not have band-aids in our code

// Make sure Alice which is not a direct peer will see this channnel
// as well after it is confirmed. A public zeroconf channel is only
// announced to the broader network after six confirmations.
ht.AssertTopologyChannelOpen(ht.Alice, fundingPoint3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think this change is necessary - if we wanna test relaying gossip we should test it in graph tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok will remove

// after 6 confirmations are reached is not considered stale. This
// ChanUpdate is important because it unsets the __dont_forward__ bit
// signaling the broadcasting to the network.
time.Sleep(time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to sleep here? If the channel updates have diffs they shouldn't be considered as stale?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the problem which manifests in the itests lies here:

lnd/discovery/gossiper.go

Lines 2721 to 2732 in 14a6f53

if d.cfg.Router.IsStaleEdgePolicy(
graphScid, timestamp, upd.ChannelFlags,
) {
log.Debugf("Ignored stale edge policy for short_chan_id(%v): "+
"peer=%v, msg=%s, is_remote=%v", shortChanID,
nMsg.peer, nMsg.msg.MsgType(), nMsg.isRemote,
)
nMsg.err <- nil
return nil, true
}

our timestamp granularity is 1 second, now I introduced the Change where we send a ChanUpdate after 3 blocks(if default value is active) but also after 6 blocks with the dont_forward bit unset, because in the itests blocks are mined very fast, we hit exactly the case where the ChanUpdates have the exact same timestamp. We did not see this effect prior to this change tho. So to catch this case in the itests, we have to include a check in the isStaleEdgePolicy that ChanUpdates which are different but have the same timestamp are not considered stale ? Wondering if this could have some side effects ?

Comment on lines +3197 to +3198
// channel has been announced. We anticipate the local ChanAnnouncement
// and the local ChanUpdate.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clarification here. I do wonder if instead of just counting the messages though that we should actually check that both of those distinct messages appear

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea will add the msg check.

})
// Only add the ChanUpdate to the gossiper if the
// __dont_forward__ bit signals it.
if !e1Ann.MessageFlags.HasDontForward() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I can make sense of this. This happens in the scenario where we are processing a rejected edge. The only time we add an edge to our graph that we didn't validate the AuthProof for is if we are one of the parties on the channel. I'm speculating because I'm still new to this segment of the codebase but it seems that this only happens when we add the edge prior to announcement_signatures and then possibly process the remote's channel_update prior to a valid channel_announcement which contains the proof. So in this scenario it actually stands to reason that this bit should always be set here, right?


// Only add the ChanUpdate to the gossiper if the
// __dont_forward__ bit signals it.
if !e2Ann.MessageFlags.HasDontForward() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in this case since we are processing a rejected edge, the only case where this happens is one where these flags are set, or even if they aren't, they ought to be.

Comment on lines +2285 to +2294
// Every public channel starts with the dont_forward bit set before the
// mandatory six confirmations are reached. We will see 2 different
// ChanUpdates before a public channel is announced to the broader
// network. We need to make sure we don't consider the second one as a
// keepalive msg.
if !update.MessageFlags.HasDontForward() &&
prev.MessageFlags.HasDontForward() {

return false
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼

@@ -3290,7 +3290,7 @@ func (f *Manager) extractAnnounceParams(c *channeldb.OpenChannel) (
func (f *Manager) addToRouterGraph(completeChan *channeldb.OpenChannel,
shortChanID *lnwire.ShortChannelID,
peerAlias *lnwire.ShortChannelID,
ourPolicy *channeldb.ChannelEdgePolicy) error {
ourPolicy *channeldb.ChannelEdgePolicy, shouldAnnounce bool) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL that the network requires 6 confs and doesn't simply allow it to be accepted by the acceptor's minimum_depth 🤔

funding/manager.go Outdated Show resolved Hide resolved
// unannounced even for public channels until six confirmations are
// reached.
shouldAnnounce := channel.ChannelFlags&lnwire.FFAnnounceChannel == 1 &&
channel.NumConfsRequired >= 6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumConfsRequired is the min accept depth setting, but the channel may have less than 6 confirmations at this point. This means that we'll announce the channel w/o the dont_forward bit before 6 confs have been reached

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch, I forgot about the zero-conf public channels.

// edge is persisted to disk later when propagating the new channel
// update.
if edgeInfo.AuthProof == nil {
edge.MessageFlags |= lnwire.ChanUpdateDontForward
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed now since addToRouterGraph is now called in annAfterSixConfs


// Only add the ChanUpdate to the gossiper if the
// __dont_forward__ bit signals it.
if !e2Ann.MessageFlags.HasDontForward() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code seems fine as-is, but can you elaborate -- what does the peer rate-limiting have to do with this?

log.Debugf("Update edge for short_chan_id(%v) got: %v",
shortChanID, err)
} else if strings.Contains(err.Error(),
channeldb.ErrEdgeNotFound.Error()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a separate commit with step-by-step info in the commit body. One TODO was to make the deletion + addition of edges atomic. That could be a separate PR that lands before this one.

@ziggie1984 ziggie1984 force-pushed the dont-forward-bit branch 2 times, most recently from 9748760 to 04eb1f0 Compare October 3, 2023 13:44
log.Debugf("Update edge for short_chan_id(%v) got: %v",
shortChanID, err)
} else if strings.Contains(err.Error(),
channeldb.ErrEdgeNotFound.Error()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this PR should land until atomic deletion and addition is done, just to keep things simple and not have band-aids in our code

// reached. Public zero-conf channels are also not announced to the
// broader network until six confirmations have been reached.
shouldAnnounce := channel.ChannelFlags&lnwire.FFAnnounceChannel == 1 &&
channel.NumConfsRequired >= 6 && peerAlias == nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this only be done when the actual number of confirmations is >= 6? NumConfsRequired is min_accept_depth, not the number of confirmations that the funding tx has

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree maybe not that intuitive, but for normal channels we only execute this function if the channel has reached the minimum_accepted depth or am I misunderstanding something ?
Nonetheless I will base it on mined blocks rather than the min_accepted_height

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, this should probably have a more clear comment explicitly saying that 6 confs have passed in this case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I decided to go with the check as it is and added a more detailed comment. Fetching chain data here would overcomplicate things in my view because we would have to fetch chain data here (introducing some level of indirection) wdyt ?

@ziggie1984 ziggie1984 force-pushed the dont-forward-bit branch 2 times, most recently from b44fc60 to 0bee285 Compare October 4, 2023 15:42
Set the dont_forward bit in the chan update msg  when a channel
is not (yet) announced to the network. This is either the case
for channels which signal this in the ChannelFlags of the
OpenChannel msg but also for public channels which have still not
reached the minimum depth height of 6 confirmations.
Adopt tests to make sure the dont_forward bit is set
when needed.
When a channel update is now executed the msg field will be
migrated on the fly including the dont_forward bit.
In case the __dont_forward__ is set in the ChanUpdate msg lnd
will not forward this msg to the gossiper hence broadcasting
it to all its peers. This makes sure we follow the network rules
and dont forward ChanUpdates which do not signal it.
This pr introduced a change in the ChanUpdate msg. We now
explictily signal to our peer whether he should broadcast our msg to
the broader network. So it is important that we receive a new
ChanUpdate msg from our peer after the six confirmations are reached
and we should broadcast the existence of this channel to the broader
network.
However there is a race condition for non-zero-conf public alias
channels for now because the channel edge is removed+added after six
confirmations to make sure we do not broadcast any ChanUpdate which
uses the scid instead of the real short channel id. During the removal
and the addition of the non-zero-conf public alias channel we might
receive an update for this channel (because our peer was
faster readding the channel in its db and sending out the new
ChanUpdate using the confirmed-scid). If we did not catch this edge
case messages from this peer would be rate limited and the new
ChanUpdate msg where the __dont_forward__ bit is unset would not be
received in time.
This edge case should be removed as soon as we implement
atomic deleting and addition of channel edges (for alias channels).
We test that a public zeroconf channel is gossiped through the
network after 6 confirmation.

Moreover itests had to be adopted because we are now sending an
additional ChanUpdate msg for normal public channels signaling
the broadcasting to the broader network.
@ziggie1984
Copy link
Collaborator Author

I analysed the itests which were failing occasionally and could find the problem. So because we now send the ChanUpdate after six_confirmation again, for slower backends we run into the problem where 2 consecutive ChanUpdates are sent in the itests very fast. First one here:
https://github.com/lightningnetwork/lnd/pull/7759/files#diff-bbccb27bb17be513ad84c77ad8e37a4d76168956f7978309aab29a0cbfdf91acR3487-R3490

and the second one here after the Proofs of the peer are received:
https://github.com/lightningnetwork/lnd/pull/7759/files#diff-1eda595bbebe495bd74a6a0431c46b66cb4e8b53beb311067c010feac2665dcbR3350-R3356

This can happen so fast that our check here:

lnd/discovery/gossiper.go

Lines 2721 to 2732 in 42dfa2c

if d.cfg.Router.IsStaleEdgePolicy(
graphScid, timestamp, upd.ChannelFlags,
) {
log.Debugf("Ignored stale edge policy for short_chan_id(%v): "+
"peer=%v, msg=%s, is_remote=%v", shortChanID,
nMsg.peer, nMsg.msg.MsgType(), nMsg.isRemote,
)
nMsg.err <- nil
return nil, true
}

fails to distinguish those two consecutive ChanUpdates not marking the latter as stale (because the db is not updated fast enough with the newest update). So for itests where we test the rate-limiting this can lead to some effects where we rate-limit too much and therefore not recognizing the channel update policies.

I am not sure whether we want to accept this because a lot of people could experience failed itests in their PRs solely based on this behaviour above.

Happy to hear your conclusion.


Apart from that this PR is in stall mode currently and waiting for a PR which makes the deletion and addition of edges atomic (basically a new command maybe called readd_edge). This command is needed because for alias channels we have to readd the edge to make sure we do not propagate any ChanUpdates to the broader network using the alias instead of the real_chan_id. Doing this not atomic can lead to side-effects observed during the current PR.

@lightninglabs-deploy
Copy link

@hieblmi: review reminder
@saubyk: review reminder
@ziggie1984, remember to re-request review from reviewers when ready

@ziggie1984
Copy link
Collaborator Author

!lightninglabs-deploy mute

@Roasbeef Roasbeef deleted the branch lightningnetwork:0-18-staging November 10, 2023 21:31
@Roasbeef Roasbeef closed this Nov 10, 2023
@Roasbeef
Copy link
Member

Deleted the staging branch, but this should go into master.

I can't re-open this for some reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[spec]: start to set the dont_forward bit in ChannelUpdate
8 participants