-
Notifications
You must be signed in to change notification settings - Fork 993
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(swarm): replace address scoring with explicit candidates #3954
Conversation
For kademlia to be useful, we only want publicly reachable addresses in our routing table. Thus, as a general rule, we will only ever add a remote's address to the routing table if:
rust-libp2p/protocols/kad/src/behaviour.rs Lines 2048 to 2059 in cc5b346
(2) is where the client/server mode implementation comes in. A node SHOULD automatically enable/disable advertising the kademlia protocol based on whether it is publicly reachable on an address. For example, the inbound connection could have been hole-punched, meaning the address isn't generally publicly reachable and thus we should not advertise the kademlia protocol there. How do we figure that out? With protocol like AutoNAT and identify. To start with, a node only knows about its listen addresses. We inject them into the As soon as a connection is established (regardless of inbound or outbound), we run the identify protocol. Identify reports us the address that the other peer observed for us. This is essentially STUN for libp2p. For example, if we are listening on Using this PR, we can now emit this address via Let's assume for a moment that we are and AutoNAT reports that we are indeed reachable via Here comes the tricky question: What local interface does this correspond to? If the user specifies a wildcard address for listening, we will start listening on all interfaces available, including loopback. Filtering for private and loopback addresses is not enough though, a machine could easily have multiple NICs and thus multiple different IP addresses. Currently, we do not expose which local interface an outbound connection corresponds to: rust-libp2p/swarm/src/behaviour.rs Lines 178 to 190 in cc5b346
To go back to kademlia's client mode: The only thing we know there is which local interface a connection arrived on. To say with confidence that this interface is publicly reachable, we need a mapping of an external reachable address back to the local interface. I don't think our AutoNAT implementation currently does this. I think what we'd need to implement in this PR is the following:
@mxinden Please let me know what you think and whether you see any hole in this reasoning. I believe this is the most correct way to identify external addresses but it will also require some effort in implementing. A shortcut we could take in the meantime is to assume we are publicly reachable on all interfaces if we have at least one externally reachable address. |
Why do we need to know what the corresponding local interface is? |
If the machine has multiple NICs and thus multiple local interfaces then the external address only corresponds to one of them, correct? Say I have two local addresses:
The Ethernet connection is only connected to a LAN whereas the WiFi connection is routable to the public internet with a forwarded port. Say I am also running multiple libp2p nodes in my local network and thus have incoming connections on both interfaces. It would be wrong to advertise kademlia on the Ethernet connection because it would cause those other nodes in my network to include this address in the routing table despite not being globally reachable. I know this is a constructed example but people might have all sorts of weird network topologies. Fundamentally, I just think it might not be a good idea to assume any incoming connection is publicly routable just because we identified an external address for one of our connections? |
Thanks for expanding, that makes sense. I suggest treating the multiple-interfaces use-case as an additional feature. I prefer to solve the single-interface use-case first. As far as I can tell the single-interface use-case is a step forward, even when the long term goal would be the multiple-interface use-case. I am afraid that the multiple-interface use-case is a rabbit-hole which we are unable to solve properly without having a concrete use-case that we are optimizing for.
Meta level, I think what you are referring to here is "any listen address of an incoming connection is publicly routable". A connection itself is never routable, neither publicly nor privately within a subnet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me. Next to making client-mode a lot easier, I find this to be a nice clean-up.
/// - A protocol such as identify obtained it from a remote. | ||
/// - The user provided it based on configuration. | ||
/// - We made an educated guess based on one of our listen addresses. | ||
/// - We established a new relay connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any mechanism to confirm a relayed connection? I am not aware of a libp2p-autonat
implementation that would probe relayed addresses. That said, emitting them as NewExternalAddrCandidate
makes sense, as we can not confirm them. How would libp2p-identify
make sure to include relayed addresses when advertising the local node's listenAddrs
(i.e. external addresses)? Would libp2p-identify
ignore all FromSwarm::NewExternalAddr
addresses except for the relayed ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so but the relay could (should?) emit an event that the relay address is now an external address of ours.
How would
libp2p-identify
make sure to include relayed addresses when advertising the local node'slistenAddrs
(i.e. external addresses)? Wouldlibp2p-identify
ignore allFromSwarm::NewExternalAddr
addresses except for the relayed ones?
Why ignore them? A relayed address is just one more external address, isn't it? It is feasible to use relays to translate between IPv4 and IPv6 for example if node you want to connect to is only on IPv6 and you are only on IPv4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would
libp2p-identify
ignore allFromSwarm::NewExternalAddr
addresses except for the relayed ones?Why ignore them?
There seems to be a misunderstanding. I suggest libp2p-identify
to track all addresses confirmed via ExternalAddrConfirmed
. That would not include relayed addresses. Though one still needs to advertise ones relayed addresses via libp2p-identify
. I don't think one should advertise all NewExternalAddrCandidate
, but instead only the relayed NewExternalAddrCandidate
s.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can treat a relayed address as externally reachable without verifying it. We did connect to the relay after all so it is kind of implicit that others will be able to connect to the relay too. Thus, I think the relay behaviour could just issue a ToSwarm::NewExternalAddr
with the relayed address.
Sounds good to me. I'd like to add a log statement to kademlia then where we are making this assumption. Should make debugging easier in case someone runs into that issue.
Yep that makes more sense :D |
Co-authored-by: Max Inden <mail@max-inden.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me apart from the above small suggestion.
We should thoroughly test this in combination with #3877. Easiest step I would guess is to deploy it to kademlia-exporter.max-inden.de/.
We no longer test external addresses.
@mxinden Let me know what you think of 3c38b3f. I didn't see any value in keeping that part of the test because we now have explicit candidates that AutoNAT operates on. I did consider directly flipping to |
Additional commits look good to me. Only open discussion is #3954 (comment). |
I had to delete two more AutoNAT tests that aren't easily fixable with how we treat external addresses now. If you have an idea on how to fix them, feel free to push a commit. |
This reverts commit 62ec482.
@mxinden I restored the tests with the idea we discussed. The |
The comment is addressed and the deleted tests are restored, thus I am merging this based on the above approval. |
> Observed addresses (aka. external address candidates) of the local node, reported by a remote node > via `libp2p-identify`, are no longer automatically considered confirmed external addresses, in > other words they are no longer trusted by default. Instead users need to confirm the reported > observed address either manually, or by using `libp2p-autonat`. In trusted environments users can > simply extract observed addresses from a `libp2p-identify::Event::Received { info: > libp2p_identify::Info { observed_addr }}` and confirm them via `Swarm::add_external_address`. Follow-up to libp2p#3954.
> Observed addresses (aka. external address candidates) of the local node, reported by a remote node > via `libp2p-identify`, are no longer automatically considered confirmed external addresses, in > other words they are no longer trusted by default. Instead users need to confirm the reported > observed address either manually, or by using `libp2p-autonat`. In trusted environments users can > simply extract observed addresses from a `libp2p-identify::Event::Received { info: > libp2p_identify::Info { observed_addr }}` and confirm them via `Swarm::add_external_address`. Follow-up to #3954. Pull-Request: #4052.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think docs were missed here, let me know if it's me who is misunderstanding something
ExpiredExternalAddr(ExpiredExternalAddr<'a>), | ||
/// Informs the behaviour that we have discovered a new candidate for an external address for us. | ||
NewExternalAddrCandidate(NewExternalAddrCandidate<'a>), | ||
/// Informs the behaviour that an external address of the local node was removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these docs looks wrong to me, or I'm missing something
/// [`FromSwarm`] variant that informs the behaviour that an external address was removed. | ||
#[derive(Clone, Copy)] | ||
pub struct ExpiredExternalAddr<'a> { | ||
pub struct ExternalAddrConfirmed<'a> { | ||
pub addr: &'a Multiaddr, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these docs correct? They seem to belong to the type below?
I ended up in the pr because docs don't seem to relate to the type name
You are right, thanks for spotting! I'll see to fix it up later :) |
Discovered in #3954 (review). Pull-Request: #4135.
/// Informs the behaviour that an external address was removed. | ||
ExpiredExternalAddr(ExpiredExternalAddr<'a>), | ||
/// Informs the behaviour that we have discovered a new candidate for an external address for us. | ||
NewExternalAddrCandidate(NewExternalAddrCandidate<'a>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and the events below miss their corresponding copies in SwarmEvent
, correct @thomaseizinger? Or was it a deliberate choice to not report these events to the user as well?
Related to #4688.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't a deliberate choice to omit them but I have to admit I also didn't know that we were activel trying to keep these in sync.
Previously, a `NetworkBehaviour` could report an `AddressScore` for an external address. This score was a `u32` and addresses would be ranked amongst those. In reality, an address is either confirmed to be publicly reachable (via a protocol such as AutoNAT) or merely represents a candidate that might be an external address. In a way, addresses are guilty (private) until proven innocent (publicly reachable). When a `NetworkBehaviour` reports an address candidate, we perform address translation on it to potentially correct for ephemeral ports of TCP. These candidates are then injected back into the `NetworkBehaviour`. Protocols such as AutoNAT can use these addresses as a source for probing their NAT status. Once confirmed, they can emit a `ToSwarm::ExternalAddrConfirmed` event which again will be passed to all `NetworkBehaviour`s. This simplified approach will allow us implement Kademlia's client-mode (libp2p#2032) without additional configuration options: As soon as an address is reported as publicly reachable, we can activate server-mode for that connection. Related: libp2p#3877. Related: libp2p#3953. Related: libp2p#2032. Related: libp2p/go-libp2p#2229. Co-authored-by: Max Inden <mail@max-inden.de> Pull-Request: libp2p#3954.
Description
Previously, a
NetworkBehaviour
could report anAddressScore
for an external address. This score was au32
and addresses would be ranked amongst those.In reality, an address is either confirmed to be publicly reachable (via a protocol such as AutoNAT) or merely represents a candidate that might be an external address. In a way, addresses are guilty (private) until proven innocent (publicly reachable).
When a
NetworkBehaviour
reports an address candidate, we perform address translation on it to potentially correct for ephemeral ports of TCP. These candidates are then injected back into theNetworkBehaviour
. Protocols such as AutoNAT can use these addresses as a source for probing their NAT status. Once confirmed, they can emit aToSwarm::ExternalAddrConfirmed
event which again will be passed to allNetworkBehaviour
s.This simplified approach will allow us implement Kademlia's client-mode (#2032) without additional configuration options: As soon as an address is reported as publicly reachable, we can activate server-mode for that connection.
Related: #3877.
Related: #3953.
Related: #2032.
Related: libp2p/go-libp2p#2229.
Co-authored-by: Max Inden mail@max-inden.de
Notes & open questions
Change checklist