Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(networking): set and use target outbound connections + prune #1739

Merged
merged 1 commit into from
May 18, 2023

Conversation

alrevuelta
Copy link
Contributor

Description

  • Before this PR nwaku nodes constantly tried to create new outcoming relay connections up to max connections. In theory, this could cause a node to only have out connections, which is a selfish behaviour since this takes a lot of incoming slots (of other peers) in the network.
  • This PR enforces a given in/out ratio of connections, leaving more incoming slots than outgoing, which is more altruistic with the network, something quite important in the fleets. It archieves so by:
    • prunning exceding inbound connections
    • not attempting mroe out connections if the target was reach.

closes #1495

Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Really interesting!
I added a few suggestions that I hope they help a little :)


for p in inRelayPeers[0..<connsToPrune]:
await pm.switch.disconnect(p)

proc connectToRelayPeers*(pm: PeerManager) {.async.} =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename this proc to something like updateRelayPeersConnections

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wops i missed this. isn't connectToRelayPeers shorter and more descriptive?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @alrevuelta ! It is shorter but this proc not only connectToNodes but also pruneInRelayConns. This is why I see it more convenient to have the "update" word on its name. Maybe sth like updateRelayPeers can suit better, which is even shorter ;P

Comment on lines +532 to +533
if outRelayPeers.len >= pm.outPeersTarget:
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we also prune the exceeding out connection at this point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, but the thing is that since we control the out connections that we create, that shouldnt be necessary. i mean, we should never exceed the max out limit, since we do this check at connectToRelayPeers.

note that on top of this in prunning, we will have a prunning based on libp2p scoring, which will prune in/out connections from our mesh based on a given score. But this comes from gossipsub.

proc logSummary*(pm: PeerManager) {.async.} =
heartbeat "Log peer manager summary", LogSummaryInterval:
let (inRelayPeers, outRelayPeers) = pm.connectedPeers(WakuRelayCodec)
let maxConnections = pm.switch.connManager.inSema.size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it may be also interesting to print the in/out ratio, for debugging purposes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its there right? or wdym? we have inPeersTarget and outPeersTarget

inRelayConns = $inRelayPeers.len & "/" & $inPeersTarget,
outRelayConns = $outRelayPeers.len & "/" & $pm.outPeersTarget,

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Morning! I mean to print the ratio that we expect. This is the real ratio that is being applied. I would print the theoretical ratio just in case it is not accomplished by the real one, for any reason.
Not a big deal though. If you consider we have enough info, then go ahead :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. if you dont mind i leave it as it is, since whats being printed here its more complete imho. eg: "...5/10, 30/50" which indicates that 5 in slots out of 10 are taken and 30 out of 50 out are taken.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok cool, is perfect then! thanks !

Comment on lines +516 to +518
proc pruneInRelayConns(pm: PeerManager, amount: int) {.async.} =
let (inRelayPeers, outRelayPeers) = pm.connectedPeers(WakuRelayCodec)
let connsToPrune = min(amount, inRelayPeers.len)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
proc pruneInRelayConns(pm: PeerManager, amount: int) {.async.} =
let (inRelayPeers, outRelayPeers) = pm.connectedPeers(WakuRelayCodec)
let connsToPrune = min(amount, inRelayPeers.len)
proc pruneInRelayConns(pm: PeerManager, connsToPrune: int) {.async.} =

As per how this proc is used, the min(amount, inRelayPeers.len) will always return amount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, as its used yep but you never know hoy a function will be used. good sanity check to avoid accessing out of bound elements.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, makes sense!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to pass list of peerIds to prune here though? Instead of amount - that way we could avoid counting/getting the peers again (you already do it before calling this function) and it would allow for potentially "smarter" ways of choosing the peers to prune - this just goes from first peer, but by passing an array of peerIds to prune the caller could come up with a different way how to choose the peers to prune

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have one comment on the selected out:in ratio, but non-blocking as I don't have a better suggestion. May have to come from experiments.

@@ -333,6 +337,7 @@ proc new*(T: type PeerManager,
storage: storage,
initialBackoffInSec: initialBackoffInSec,
backoffFactor: backoffFactor,
outPeersTarget: max(maxConnections div 10, 10),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking out loud that 1:9 out:in ratio may be a bit permissive as this increases the eclipse attack surface. Arguably the most trustworthy relay connections from each node's perspective that will provide the most stable mesh are the outgoing connections based on random discv5 walks. Perhaps then a further consideration could be to ensure that e.g. 50% of the mesh peerings are outgoing (although that will probably break gossipsub in unexpected ways). I don't have any ratio or other suggestion here, but highlighting for future discussion and investigation. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, the eclipse attack was also my concern. took this ratio from both nimbus (1:9) and prysm (1:8) but we can revisit later on.

unsure how they justify this number given the possibility of eclipse attacks. Perhaps from a practical point of view its more complex since the "eclipser" is compeeting with the whole network to get these slots, so it would be very difficult for the attacker to end up having all slots?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably unrelated to the eclipse attack, but would it make sense to make to ratio configurable?

@alrevuelta alrevuelta merged commit 87f694a into master May 18, 2023
@alrevuelta alrevuelta deleted the in-out-ratio branch May 18, 2023 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat(networking): Keep given amount inbound/outbound peers.
4 participants