fix(quic): fix address translation #4896
Thanks! I wonder why it was implemented like this ..
Maybe @mxinden remembers.
let mut iter = translated.iter();
assert_eq!(iter.next(), Some(Protocol::Ip4(observed_ip)));
assert_eq!(iter.next(), Some(Protocol::Udp(port)));
assert_eq!(iter.next(), Some(Protocol::QuicV1));
assert_eq!(iter.next(), None);
Wouldn't it be cleaner to use `assert_eq` against a constructed address?
This test is translated 1:1 from the TCP test. I can update it, but then the TCP test should probably be updated too (and maybe others?).
.with(Protocol::Ip4(observed_ip))
.with(Protocol::Udp(1))
.with(Protocol::QuicV1)
.with(Protocol::P2p(PeerId::random()));
Is this one important?
let quic_listen_addr = Multiaddr::empty()
    .with(Protocol::Ip4(Ipv4Addr::new(127, 0, 0, 1)))
    .with(Protocol::Udp(port))
    .with(Protocol::QuicV1);
Perhaps a helper function that takes in an `Ipv4Addr` and a port would make this test a bit more concise.
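A minimal sketch of what such a helper could look like. To stay self-contained it returns the textual multiaddr form as a `String` rather than a `Multiaddr`; the name `quic_addr_string` is hypothetical and not part of the PR.

```rust
use std::net::Ipv4Addr;

/// Hypothetical test helper: renders the textual form of a QUIC multiaddr
/// for the given IPv4 address and UDP port. A real test would more likely
/// build and return a `Multiaddr` instead of a `String`.
fn quic_addr_string(ip: Ipv4Addr, port: u16) -> String {
    format!("/ip4/{}/udp/{}/quic-v1", ip, port)
}
```

With a helper of this shape, the listen address and the expected translated address could each be built in one line and compared with a single `assert_eq!`.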
Some(observed.clone())
address_translation(listen, observed)
I am not sure what issue this is addressing.
The `libp2p_core::address_translation` function takes the port of `listen` and the IP of `observed`. This is relevant in the case of TCP without port-reuse, where one does not listen on the port of an outgoing connection, i.e. where one does not listen on the port of `observed`.
The relevant section of the `address_translation` doc comment (rust-libp2p/core/src/translation.rs, lines 29 to 32 in 338e467):
/// This function can for example be useful when handling tcp connections. Tcp does not listen and
/// dial on the same port by default. Thus when receiving an observed address on a connection that
/// we initiated, it will contain our dialing port, not our listening port. We need to take the ip
/// address or dns address from the observed address and the port from the original address.
I argue that this is not relevant for QUIC, given that QUIC will use the same UDP port for both outgoing and incoming connections. The port of `observed` is either equal to the port of `listen` or, if the local node is behind a NAT, equal to the public NAT port mapping of `listen`. In the former case `address_translation` is a no-op. In the latter case `address_translation` returns the wrong port, namely the LAN port, not the public Internet port.
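The rule under discussion (IP from `observed`, port from `listen`) can be sketched on plain socket addresses. This is a simplified model using `std::net::SocketAddr`, not the `libp2p_core` function itself; the name `translate` is an assumption made for illustration.

```rust
use std::net::SocketAddr;

/// Simplified model of the translation rule: keep the IP of `observed`
/// and take the port from `listen`. The real function operates on
/// `Multiaddr` values; this sketch only mirrors the IP/port handling.
fn translate(listen: SocketAddr, observed: SocketAddr) -> SocketAddr {
    SocketAddr::new(observed.ip(), listen.port())
}
```

If `observed` already carries the listen port, the result equals `observed` (the no-op case). If a NAT rewrote the port, the result carries the LAN listen port instead, which is reachable only when that same port is forwarded on the NAT.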
Note that with the upcoming redesign of #4568 we can get rid of `Transport::address_translation`. Whether a connection has been established using port-reuse or not will be visible to a `NetworkBehaviour`. Thus e.g. `libp2p-identify`, on receiving an observed address, can judge for itself whether it needs to alter any ports.
Also see #2289 (comment) for past discussion.
> I am not sure what issue this is addressing.
This addresses the problem of ephemeral ports with QUIC. At least when behind a NAT with port forwarding, without address translation you'll end up with an invalid external port. The more annoying thing is that it will be valid for the duration of Autonat probing, but when anyone else later tries to reach the node, the port is no longer functional.
> The port of observed is either equal to the port of listen, or, if the local node is behind a NAT, equal to the public NAT port mapping of listen.
This is most certainly not true in practice, at least in some cases. Even on my machine, with pfSense acting as a router, I have seen random ports all over the place observed by remote nodes.
> In the former case address_translation is a no-op.
Yes.
> In the latter case address_translation returns the wrong port, namely LAN port, not the public Internet port.
No, in this case address translation will result in the correct port. I guess it depends on how port forwarding is done. It should be possible for a router to map ports both ways, but I can't imagine many users actually doing that: external ports are typically mapped onto LAN ports, but outgoing LAN ports are probably rarely mapped onto specific WAN ports. This is my educated guess based on observation; I'm not claiming that no consumer routers do that.
> Note that with the upcoming redesign of #4568 we can get rid of Transport::address_translation. Whether a connection has been established using port-reuse or not will be visible to a NetworkBehaviour. Thus e.g. libp2p-identify receiving an observed address can judge itself, whether it needs to alter any ports.
Interesting. Well, then this will need to be adjusted when that happens, but in the meantime the network needs to function, and the lack of address translation was a big issue for us, since many nodes were not able to discover their public addresses with QUIC (which we now use almost exclusively in place of TCP).
> Also see #2289 (comment) for past discussion.
This means the current implementation is still wrong, though, as it will happily return an observed TCP address as the translated address. If QUIC is "before" TCP in the transport stack, it would return a wrong address, wouldn't it?
See lines 252-256 above: the protocol is checked there and `None` is returned if it doesn't match. The same is done in TCP and it works properly.
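The guard described here can be illustrated with a toy address model. The `Proto` enum and `translate_quic` below are hypothetical stand-ins for `Protocol`/`Multiaddr` and the transport method; the actual check lives in the lines referenced above and may differ in detail.

```rust
use std::net::Ipv4Addr;

/// Toy stand-in for multiaddr components (hypothetical, for illustration).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Proto {
    Ip4(Ipv4Addr),
    Udp(u16),
    Tcp(u16),
    QuicV1,
}

/// Translate only when both addresses have the QUIC shape
/// `/ip4/../udp/../quic-v1`; anything else (e.g. an observed TCP
/// address) yields `None` so another transport can handle it.
fn translate_quic(listen: &[Proto], observed: &[Proto]) -> Option<Vec<Proto>> {
    match (listen, observed) {
        (
            [Proto::Ip4(_), Proto::Udp(port), Proto::QuicV1],
            [Proto::Ip4(ip), Proto::Udp(_), Proto::QuicV1],
        ) => Some(vec![Proto::Ip4(*ip), Proto::Udp(*port), Proto::QuicV1]),
        _ => None,
    }
}
```

Under this model an observed TCP address never comes back as a translated QUIC address, regardless of where QUIC sits in the transport stack.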
> See lines 252-256 above, the protocol is checked there and returns None if it doesn't match. Same is done in TCP and it works properly.
Ah, you are right, I missed that because the diff didn't show it 🙄
I don't really understand why we need to change anything here then. QUIC doesn't have a concept of ephemeral ports, so `address_translation` can't do anything useful. If you still observe changing ports with QUIC, it might be that you are sitting behind a symmetric NAT. We can't do anything about that, I am afraid.
> The more annoying thing is that it will be valid for the duration of Autonat probing, but when later anyone else tries to reach the node, the port is no longer functional.
We are aware of this issue. Unfortunately, it is a design flaw in AutoNAT and one of the things that prompted the design of AutoNATv2. In v2, the server uses a different outgoing port to perform the dial-back, meaning we don't accidentally "hole-punch" through the NAT and get a more trustworthy result that the address is actually reachable.
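The effect described above can be sketched with a toy NAT model. It assumes a NAT that only admits inbound packets on a mapped port from the exact remote endpoint the client previously contacted (address-and-port-dependent filtering), plus explicit port forwards; the `Nat` type and its methods are hypothetical and not part of libp2p.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Toy NAT: inbound traffic passes if the external port is explicitly
/// forwarded, or if an outgoing packet already opened that port towards
/// exactly this remote endpoint.
#[derive(Default)]
struct Nat {
    forwarded_ports: HashSet<u16>,
    /// (external port, remote endpoint) pairs opened by outgoing packets.
    mappings: HashSet<(u16, SocketAddr)>,
}

impl Nat {
    fn record_outgoing(&mut self, external_port: u16, remote: SocketAddr) {
        self.mappings.insert((external_port, remote));
    }

    fn accepts_incoming(&self, external_port: u16, from: SocketAddr) -> bool {
        self.forwarded_ports.contains(&external_port)
            || self.mappings.contains(&(external_port, from))
    }
}
```

In this model a dial-back from the server's original endpoint succeeds on the ephemeral external port even though no third party could reach it, whereas a dial-back from a different source port succeeds only if the port is genuinely forwarded.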
> If you still observe changing ports with QUIC, it might be that you are sitting behind a symmetric NAT. We can't do anything against that I am afraid.
But I still might have port forwarding configured such that the port matching my listening port is still reachable from the outside. This is what we recommend our users do when they are behind a NAT, and it seems to work. It doesn't cover the case where the forwarded port doesn't match the listening port; in that case the user has to specify the external address explicitly. But it is still useful to do address translation: as mentioned above, in the worst case it is a no-op.
As discussed in the open maintainers call today, @nazar-pc can you research whether you are behind an endpoint-independent or an endpoint-dependent NAT?
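One way to approach this question, sketched under the assumption that the same local socket contacts several different remotes and each reports the address it observed. The names `NatMapping` and `classify` are hypothetical.

```rust
use std::net::SocketAddr;

#[derive(Debug, PartialEq)]
enum NatMapping {
    /// Every remote saw the same external address and port.
    EndpointIndependent,
    /// At least two remotes saw different external addresses or ports.
    EndpointDependent,
}

/// `observations` holds the external addresses that different remotes
/// reported for one and the same local socket. Returns `None` when there
/// are fewer than two observations, since nothing can be concluded then.
fn classify(observations: &[SocketAddr]) -> Option<NatMapping> {
    let (first, rest) = observations.split_first()?;
    if rest.is_empty() {
        return None;
    }
    if rest.iter().all(|o| o == first) {
        Some(NatMapping::EndpointIndependent)
    } else {
        Some(NatMapping::EndpointDependent)
    }
}
```

A different external port per destination, as in the reply that follows, would fall into the `EndpointDependent` case, which is what is commonly called a symmetric NAT.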
I get unique random outgoing port for every
So even if my app is making a request from port 30535, the actual request will go out from a random port. If I make a request from that port to a different host, the port will also be different. The only reason I'm reachable is because I have port forwarding from
I see autonat doesn't actually care about the address: it will replace all received addresses with an observed one anyway and then try to dial it. So the response in autonat doesn't actually correspond to the request directly. What I don't like about this is that the autonat server is essentially doing something similar to address translation, but it should still work.
The problem is that because the autonat client concatenates observed addresses with listen addresses rather than the other way around, it will in many cases confirm an ephemeral external address instead of the actual address with QUIC on my machine. I think eventually it will be able to find the public address, but certainly not immediately.
Another concern is that all local listen addresses are exposed when probing, which is somewhat revealing when running on the host and not in an isolated container.
That sounds like a symmetric NAT to me. What is odd is that your NAT device doesn't seem to take into account the configured port-forwarding. Whilst not necessary for things to function, it would seem logical to apply the port-forwarding in a reverse fashion for outgoing connections to allow peers to use the observed address.
In this case, address translation would be helpful because you've done a direct mapping of your port-forwarding. But, we don't know that in the application so not doing address translation is as good of a guess as doing it ...
Once we ship #4568, we should be able to improve things here. In that PR, we know whether we reused a port or not for a new connection. This means, As a result, we can remove the current
This pull request has merge conflicts. Could you please resolve them @nazar-pc? 🙏
So it looks like you're saying that address translation will be moved to identify at some point in the future? I guess that makes sense, but I don't see why this PR shouldn't be merged in the meantime. I think it is pretty clear that there is an issue with QUIC right now, and the fact that it works with Autonat v1 is a coincidence: Autonat v1 essentially also does address translation on its end, replacing the IP in the listen address with the public IP, which results in a reachable address overall. On the surface it also looks like QUIC will break with Autonat v2 if no other changes are made, because server-side address translation will not happen anymore (if my understanding is correct).
Yes that is the plan. Identify is the protocol that gathers the observed address, meaning it is its job to filter out ephemeral ports. With the changes of #4568, we know whether an outgoing connection used a new port or the existing listen port. In the case of a new one, we can just discard the observed address because we know that it is garbage and will never be a valid candidate for an external address. Thus, there is no more need to do address translation.
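The filtering rule described in this comment can be sketched as follows. The `PortUse` enum and `external_candidate` function are illustrative names chosen here; the actual types and hooks introduced by #4568 may look different.

```rust
use std::net::SocketAddr;

/// Whether an outgoing connection reused the listen port or was made
/// from a freshly allocated one (illustrative type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum PortUse {
    Reuse,
    New,
}

/// Decide whether an observed address is worth keeping as an external
/// address candidate. With a new (ephemeral) port the observation is
/// discarded outright instead of being translated.
fn external_candidate(observed: SocketAddr, port_use: PortUse) -> Option<SocketAddr> {
    match port_use {
        PortUse::Reuse => Some(observed),
        PortUse::New => None,
    }
}
```

The design choice is to drop unusable observations rather than guess a port, which is why no translation step is needed in this model.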
It is not really clear to me. Are the nodes that you are observing this on listening for incoming connections, or are they only making outgoing connections? If you don't have a listening socket, then we will make one on a random port. Perhaps that is the issue you are observing?
Description
This fixes address translation for QUIC that was essentially non-existent before.
Notes & open questions
The test is analogous to the corresponding test in the TCP protocol implementation. I have added a test for both async-std and tokio, even though compiling the crate with just the tokio feature enabled causes a lot of compilation warnings.
What I'm not sure about is whether the old behavior is intentional, but to me it seemed like a bug that caused major issues.