Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seeker autoconnect #7798

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

endothermicdev
Copy link
Collaborator

Important

24.11 FREEZE NOVEMBER 7TH: Non-bugfix PRs not ready by this date will wait for 25.02.

Checklist

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Tests have been added or modified to reflect the changes.
  • Documentation has been reviewed and updated as needed.
  • Related issues have been listed and linked, including any that this PR closes.

This builds on #7768 to make our gossip seeker more capable at syncing the network graph. In the case of nodes with a small number of peers, it may be impossible to get a reasonably complete enough gossip sync. In this case, choosing a node at random to try to connect and request additional gossip seems reasonable.

Testing from 14 Sept 2024 found that only 30% of 60 nodes tested were able to sync >90% of the current network graph. This should significantly aid small nodes in syncing their gossip.

Changelog-Changed: Gossipd can now request connections to additional nodes for improved gossip sync

@endothermicdev endothermicdev added this to the v24.11 milestone Nov 8, 2024
Based on gossip sync data from random network peers, listening to only 5
peers will not reliably catch all gossip traffic.  For now, add extra
redundancy.
This can help us backfill any missing gossip if our current
peers haven't been the most reliable.

Changelog-Changed: Gossipd requests a full sync from a random peer every hour.
This will be used to drop underperforming gossipers instead
of choosing at random.
Previously they were chosen at random.  We should instead
prefer to strop streaming from the peer who has provided
the least novel gossip.
This will allow all gossipers to be rotated in ~5 hours similar to
how it operated with half as many gossip streaming peers.
in test_gossip_force_broadcast_channel_msgs now that the seeker
is asking for periodic full gossip syncs.
Gossipd uses this to ask lightningd -> connectd to initiate
a connection to a new gossip peer.  This can be used when
there are insufficient peers already connected to gossip with.

Changelog-Changed: Gossipd can now request connections to additional nodes for improved gossip sync
This does not validate a node announcement and address, but it
does select a node at random from the gossmap and asks lightningd
to attempt a connection to it.
Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs sync code. That will probably need a dev command to trigger it manually for testing, too...

Comment on lines +974 to +977
static struct node_and_addrs *get_random_node(const tal_t *ctx,
struct seeker *seeker)
{
struct gossmap *gossmap = gossmap_manage_get_gossmap(seeker->daemon->gm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better done by providing a "struct gossmap_node *gossmap_random_node(const struct gossmap *map);" in gossmap.c, which can use nodeidx_htable_pick. Then iterate until NULL, or you find one with an address.

Comment on lines +1036 to +1042
if(!random_node->addrs || tal_count(random_node->addrs) == 0)
status_broken("seeker: random gossip node missing address");

u8 *msg = towire_gossipd_connect_to_peer(NULL, random_node->id,
random_node->addrs);
daemon_conn_send(seeker->daemon->master, take(msg));
tal_free(random_node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify: don't hand addrs here, since connectd already has that logic internally.

/* FIXME: store random sync peer and don't select next time. */
status_peer_debug(&random_peer->id,
"seeker: chosen for periodic full sync");
normal_gossip_start(seeker,random_peer, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do a sync, it just starts gossip from this point.

You need to send query_channel_range, then query_short_channel_ids for any we didn't know about in the reply.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants