Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Add QUIC Address Discovery to iroh #3049

Merged
merged 30 commits into from
Feb 3, 2025
Merged

feat!: Add QUIC Address Discovery to iroh #3049

merged 30 commits into from
Feb 3, 2025

Conversation

ramfox
Copy link
Contributor

@ramfox ramfox commented Dec 14, 2024

Description

There were two main things to "solve" on the iroh side:

  1. gross circular dependency between magicsock and the quinn endpoint
  2. needing to handle QAD packets in a special way, like we do with STUN packets that get sent from net-report

The first was accomplished by figuring out how to make MagicSock::spawn be the location that builds the quinn::Endpoint. This is what was changed:

  • implemented AsyncUdpSocket on MagicSock itself, rather than magicsock::Handle
  • magicsock::Handle now contains the quinn::Endpoint
  • we now pass down a server_config as a MagicSock::Option
  • in MagicSock::spawn:
    • after building the MagicSock we now build the quinn::Endpoint using Arc<MagicSock> as the AsyncUdpSocket
    • then we clone the quinn::Endpoint and passed it to the magicsock::Actor (which is independent of MagicSock)
    • give the quinn::Endpoint to the magicsock::Handle, which is the actual struct that we use to interact with the magicsocket
  • the iroh::Endpoint now interacts with the quinn::Endpoint using msock.endpoint() in all places

The second was accomplished by keeping a list of special "QAD addresses" on the NodeMap (NodeMap::qad_addrs), and using specific methods to add_qad_addrs, get_qad_addrs_for_send and get_qad_addrs_for_recv to deal with the fickle way that the quinn::Endpoint expects SocketAddrs to behave when it dials and when it receives packets:

  • before we do a net-report, we first attempt to resolve the RelayUrls in the RelayMap to get the SocketAddrs that we expect we will dial when we do a QAD probe
  • ~we add those addresses to the NodeMap ~
  • on the "send" side of the AsyncUdpSocket, after we do our normal checks for QuicMappedAddrs, we then check to see if the QuicMappedAddr is actually just a normal socket addr that we expect to use for QAD. If so, we just send the packets using the get_qad_addr_for_send address
  • on the "recv" side of the AsyncUdpSocket, after we check to see if the recv'd address can be mapped to a QuicMappedAddr & therefore can be received using our normal process, we then check to see if the address is one that we expect for QAD. If so, we make sure to associate the packet with the get_qad_addr_for_recv address and pass it along

The most "unreliable" bits of this are due to dns. Before running a net report, we now have to do dns discovery for all the RelayUrls. I've capped this at 300 ms but it means that until we cache the dns responses then we have this delay before starting net-report. I've also done a bit of a cheat: when we initially start the magicsock::Actor, I've added a call to resolve_qad_addrs that has a timeout of 10 ms, just to send out the dns packets and hopefully get a jump on caching the responses.

The second was accomplished by creating a new IpMappedAddrs struct that keeps track of IpMappedAddrs to the actual SocketAddr that it represents. This is akin to how we deal with QuicMappedAddr (which is now called NodeIdMappedAddr. netreport now takes an optional IpMappedAddrs, and when it resolves a RelayUrl, it adds the resolved IP addr to the IpMappedAddrs, and uses the mapped address when sending on QUIC. In iroh, we check to see if the destination or source address is an IpMappedAddr or a NodeIdMappedAddr. If so, it maps the addresses correctly before returning the transmit to the endpoint or sending the transmit over UDP.

Most interesting bit

So...if someone adds an IP address to IpMappedAddrs, we can allow folks to use the Endpoint as a sort of normal quinn::Endpoint, bypassing our special iroh holepunching sauce.

depends on #3032

Breaking Changes

  • iroh-net-report
    • changed
      • iroh_net_report::Client::new now takes additional parameter mapped_addrs: Option`

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@ramfox ramfox changed the base branch from main to ramfox/net-report-simplify December 14, 2024 07:27
Copy link

github-actions bot commented Dec 14, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3049/docs/iroh/

Last updated: 2025-02-03T13:08:02Z

@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch 3 times, most recently from 2a51de3 to d18f3d1 Compare December 14, 2024 07:47
@ramfox ramfox self-assigned this Dec 14, 2024
@flub
Copy link
Contributor

flub commented Dec 16, 2024

The second was accomplished by keeping a list of special "QAD addresses" on the NodeMap

The downside of this approach is that you need to lock the NodeMap once again on every single try_send and poll_recv call. That's not great.

What if we made a separate mapping outside of the NodeMap for this? We can make a new type IpMappedAddr (and rename QuicMappedAddr to NodeIdMappedAddr while we're at it). It would be a mirror of the current QuicMappedAddr but with a different ADDR_SUBNET.

So we use a new subnet in our existing IPv6 Unique Local Addr space. This subnet works just like the existing subnet otherwise. But is used to map real IP addresses to other fake ones.

This allows the code that needs to check whether something is a IpMappedAddr or a QuicMappedAddr (or NodeIdMappedAddr) to simply check the IPv6 prefix with no need to get any locks until it is know which one you have.

Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way of connecting the quinn::Endpoint into the MagicSock is nice!

@ramfox ramfox force-pushed the ramfox/net-report-simplify branch from 1e24436 to 4f9db39 Compare December 16, 2024 16:48
@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch 2 times, most recently from 59b2744 to ee67d96 Compare January 7, 2025 22:41
@ramfox ramfox changed the base branch from ramfox/net-report-simplify to main January 8, 2025 16:40
@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch 3 times, most recently from 8b69c62 to 96aa2a5 Compare January 10, 2025 21:08
Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my biggest question is about why IpMappedAddr maps to a SocketAddr rather than an Ipv6Addr. But then I also have that question about NodeIdMappedAddr which I don't think should be solved in this PR. And since the IpMappedAddr works as is, I'm happy to defer that as well if you'd prefer.

Otherwise this looks really good, though I guess we agreed that the IpMappedAddr would move into net-report for now.

return Err(IpMappedAddrError(String::from("not mapped addr")));
}
let octets = addr.ip().octets();
if octets[6..8] != IpMappedAddr::ADDR_SUBNET {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to check that the entire prefix matches, [0..8], no? Just the subnet is not enough.

With some luck you can compute the full prefix in a const fn, but I'm not sure if that's possible. You can always just write a full const PREFIX: [u8; 8] = [0xfd, 21, 7, 10, 81, 11, 0, 1]; next to the existing constants.

Copy link
Contributor Author

@ramfox ramfox Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adjusted this. I just checked each section of the slice, similar to how we fill the slice when we generate the addrs. Can you please look at this again and let me know if it's acceptable?

@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch 2 times, most recently from 2b99483 to 2c94b82 Compare January 17, 2025 20:46
@flub
Copy link
Contributor

flub commented Jan 21, 2025

Apologies for the verbose comments on the docs. For public docs I think we should consistently follow the docs style guide as it is user-facing.

@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch 2 times, most recently from d05a927 to 74e5797 Compare January 24, 2025 04:19
Copy link

github-actions bot commented Jan 24, 2025

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: a820e2e

@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch 5 times, most recently from 2d65359 to 4c2d6b3 Compare January 28, 2025 04:42
“ramfox” and others added 10 commits January 29, 2025 15:58
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
Co-authored-by: Floris Bruynooghe <flub@n0.computer>
@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch from 29ac612 to b1fefbd Compare January 29, 2025 20:58
@ramfox ramfox changed the title feat: Add QUIC Address Discovery to iroh feat!: Add QUIC Address Discovery to iroh Jan 29, 2025
@ramfox ramfox force-pushed the ramfox/qad-in-iroh branch from cc9fc32 to d6a729c Compare January 30, 2025 00:08
@ramfox
Copy link
Contributor Author

ramfox commented Jan 30, 2025

Okay, so the only two failing tests are semver, which should be failing, as there is a breaking change, which I've documented.

The second is netsim, which I've already worked with asmir works once this netsim PR is in: n0-computer/chuck#73 (review)

So can I please get a final review or approval?

Copy link
Contributor

@divagant-martian divagant-martian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beside the conversations that are still open and some nits on my side this lgtm. Approving for you to deal with what's missing and merge when ready

@ramfox ramfox added this to the v0.32.0 milestone Jan 31, 2025
@Arqu Arqu added this pull request to the merge queue Feb 3, 2025
Merged via the queue into main with commit 243a04a Feb 3, 2025
25 of 26 checks passed
@ramfox ramfox deleted the ramfox/qad-in-iroh branch February 3, 2025 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants