Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Bidirectional Hole Punching from QUICClient and QUICServer #4

Closed
2 tasks done
CMCDragonkai opened this issue Apr 11, 2023 · 7 comments
Closed
2 tasks done
Assignees
Labels
development Standard development r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices

Comments

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Apr 11, 2023

Specification

PK needs to use QUIC system for bidirectional hole punching.

QUICClient

The QUICClient starts a "dialing" phase when it is being created and awaiting the establishedP.

This results in it sending the Initial frame with an initial delay of 1, and doubling between.

This will repeat until the config.maxIdleTimeout is exhausted, which defaults to 5 seconds.

Each time it sends the initial frame, this is stuffed into a UDP packet which is sent to the server.

From this point of view, it is pretty much already doing a "hole punching".

The delay is currently not configurable, and currently I believe there's a bug in the QUICConnection.send that results in a delay pattern of 1, 2, 0, 4, 0, 8, 0.... I believe that even though we don't have more frames to be sent, there's a repeat of sending the UDP packet. This should be fixed.

This will occur until a connection is established or that the QUICConnection times out. If it times out, the QUICClient async creation should fail.

Once a connection is established there should be a keep alive mechanism to ensure that the NAT mappings are maintained.

QUICServer

The QUICServer on the other hand cannot just try to start a connection to the client side.

Instead we need to expose the ability to just send UDP packets to the client side. Right now the QUICServer.socket is a protected property... and in PK usage, the socket is likely to be created separately and shared between the client and server in the same peer.

But I think it would be ideal to expose the socket somehow... like being able to access QUICServer.socket.

So one can do server.socket.send().

This is actually an application concern, but we just want to support the interface to make this possible.

Tests should test that it's possible to craft random packets and send it via server.socket.send().

Random packets may be preferred over simply empty packets as we suspect that empty packets might be thrown away by routers.

Doing this will then result in the reverse hole punch, that will enable the QUIC client to connect.

Now when these packets are received by the client side, these packets are neither associated to an existing connection, nor can they initiate a new connection, so these packets will simply be thrown away. Tests should prove that this is the case.

At the same time once the connection is setup, there should be a keep alive component added to #6.

Additional context

Tasks

  1. Fix the weird delay pattern during the "dialing" phase of QUICClient Not a bug, seems to be intentional on quiche's part.
  2. Integrate the keep alive component Keepalive component after QUIC connection is established #6 when a connection is live. The interval should only be created after the connection is already established, not before it is established. Change to be made as part of Keepalive component after QUIC connection is established #6
  3. Add in the ability for the server to do a reverse dial proceedure.
@CMCDragonkai CMCDragonkai added the development Standard development label Apr 11, 2023
@CMCDragonkai CMCDragonkai changed the title Allow Bidirectional Hole Punching Allow Bidirectional Hole Punching from QUICClient and QUICServer Apr 11, 2023
@tegefaulkes
Copy link
Contributor

From what I recall we only need to check and fix the weird dialling pattern before this issue can be resolved. Is that correct @CMCDragonkai ?

@tegefaulkes
Copy link
Contributor

Ok, so the following is still left for this issue.

  1. Check and verify that the client dialling procedure isn't doing anything weird still.
  2. Create a test to check if the server can send hole punching packets to our intended client. We just need to check if we can initiate the server sending UDP packets to a target client address. That's the only requirement from the server perspective. The client's requirement is covered by the dialling procedure.

@tegefaulkes
Copy link
Contributor

Ok, so looking at the dialling procedure, I can see it retrying with a delay that is roughly doubling until it times out. That is expected. Every attempt after the first two sends two packets, the initial handshake crypto frame and a 2nd packet with a ping frame. This may have been what you were seeing before?

This seems intentional and not something like multiple connections being created and dialling.

1	0.000000000	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 0, CRYPTO
2	0.000018723	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
3	0.999736040	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 1, CRYPTO
4	0.999756405	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
5	2.999743648	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 2, CRYPTO
6	2.999765215	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
7	2.999960143	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 3, PING, PADDING
8	2.999967364	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
9	6.996612474	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 4, CRYPTO
10	6.996633542	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
11	6.996930144	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 5, PING, PADDING
12	6.996937402	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
13	14.989363271	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 6, CRYPTO
14	14.989385697	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)
15	14.989690223	127.0.0.1	127.0.0.1	52119	QUIC	1242	Initial, DCID=d2cab917bab2669cb08455824f084687, SCID=89031f015a4564af48942ec2451ae0ae41d9149b, PKN: 7, PING, PADDING
16	14.989697612	127.0.0.1	127.0.0.1	52119	ICMP	590	Destination unreachable (Port unreachable)

@CMCDragonkai
Copy link
Member Author

Did you see my reported delay pattern?

The delay is currently not configurable, and currently I believe there's a bug in the QUICConnection.send that results in a delay pattern of 1, 2, 0, 4, 0, 8, 0.... I believe that even though we don't have more frames to be sent, there's a repeat of sending the UDP packet. This should be fixed.

@tegefaulkes
Copy link
Contributor

I do see the pattern but my point is, since the packets are clearly pairs of handshake + ping packets at the expected delays, I think this is intentional. If it was repeating the handshake frames with no delay then I'd think it was a problem.

tegefaulkes added a commit that referenced this issue May 4, 2023
@tegefaulkes
Copy link
Contributor

Digging through the code I can see that this dialling procedure is fully driven by quiche. What we're seeing is after 2 failed attempts, any further attempts send two packets, the initial handshake frame and an additional ping frame. This behaviour is pretty consistent.

Looking at the logs I can see that the additional packet is generated by quiche within the same send loop. So it's quiche deciding to send two distinct packets.

In any case, this isn't something like multiple clients dialling at the same time like we originally though.

I'm going to chalk this up as intentional behaviour on the part of quiche.

tegefaulkes added a commit that referenced this issue May 4, 2023
* Related #4

[ci skip]
tegefaulkes added a commit that referenced this issue May 4, 2023
@tegefaulkes
Copy link
Contributor

tegefaulkes commented May 5, 2023

Some odd behaviour when providing an existing socket when starting a client. It's making calls to the socket.send() but no packets are actually being sent. Still digging into this.

Ah, when providing the socket to the client then it doesn't like a target host address in the IPv4 mapped IPv6 address '::ffff:127.0.0.1'. Sends are resulting in an EINVAL error.

I'm not providing a resolveHostname method when creating the socket. That's likely the problem and something to look out for.

tegefaulkes added a commit that referenced this issue May 5, 2023
tegefaulkes added a commit that referenced this issue May 9, 2023
* Related #4

[ci skip]
tegefaulkes added a commit that referenced this issue May 11, 2023
* Added a `keepAliveDelay` parameter to client, server and connections.
* Connections will send a ping frame after each interval, this should reset the timeout for both sides, given that the ping is received and responded to.
* Added tests for this feature.

* Fixes #4

[ci skip]
tegefaulkes added a commit that referenced this issue May 11, 2023
* Added a `keepAliveDelay` parameter to client, server and connections.
* Connections will send a ping frame after each interval, this should reset the timeout for both sides, given that the ping is received and responded to.
* Added tests for this feature.

* Fixes #4

[ci skip]
tegefaulkes added a commit that referenced this issue May 12, 2023
* Related #4

[ci skip]
CMCDragonkai pushed a commit that referenced this issue May 17, 2023
CMCDragonkai pushed a commit that referenced this issue May 17, 2023
CMCDragonkai pushed a commit that referenced this issue May 17, 2023
* Added a `keepAliveDelay` parameter to client, server and connections.
* Connections will send a ping frame after each interval, this should reset the timeout for both sides, given that the ping is received and responded to.
* Added tests for this feature.

* Fixes #4

[ci skip]
@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices label Jul 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices
Development

No branches or pull requests

2 participants