Send multiaddress together with PeerID even after the routing table refresh interval #9264

Closed
3 tasks done
yiannisbot opened this issue Sep 12, 2022 · 16 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@yiannisbot
Member

Checklist

  • My issue is specific & actionable.
  • I am not suggesting a protocol enhancement.
  • I have searched on the issue tracker for my issue.

Description

Context

Currently, when someone publishes something on IPFS, their multiaddress is included in the provider record for the first 10 minutes after publication (or republication) of the record. The 10 min setting comes from the routing table refresh interval. After that, only the PeerID is stored on the record and served to requesting clients. So, if a client asks for the CID within the first 10 minutes of the CID's publication, they get the multiaddress of the provider directly and can connect directly. If they ask after the 10 minute mark, they get only the PeerID and have to walk the DHT a second time to map the PeerID to a multiaddress. This is to avoid serving stale addresses when a peer has changed its multiaddress (e.g., reconnected from a different location).

Proposal

But why can we not do both? The following sounds like a reasonable approach:

  • Both the multiaddress and the PeerID are provided to the client, irrespective of whether the provider record was published in the last 10 mins.
  • The client tries to connect to the multiaddress included in the record, but also starts a DHT walk in parallel to do the PeerID -> multiaddress mapping (see the sketch after this list).
  • If the multiaddress is still valid, the DHT walk is aborted. If not, it continues.
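
A minimal sketch of the proposed client behaviour, assuming go-libp2p's `host.Host` and `routing.PeerRouting` interfaces; the function name `resolveAndConnect` and the control flow are illustrative, not an existing API:

```go
package sketch

import (
	"context"
	"fmt"

	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/core/routing"
)

// resolveAndConnect (hypothetical) dials the multiaddress from the
// provider record while walking the DHT in parallel; the walk is
// aborted as soon as the direct dial succeeds.
func resolveAndConnect(ctx context.Context, h host.Host, r routing.PeerRouting, info peer.AddrInfo) error {
	dhtCtx, cancelDHT := context.WithCancel(ctx)
	defer cancelDHT() // aborts the DHT walk if the direct dial wins

	fresh := make(chan peer.AddrInfo, 1)
	go func() {
		defer close(fresh)
		// PeerID -> multiaddress mapping via a DHT walk.
		if ai, err := r.FindPeer(dhtCtx, info.ID); err == nil {
			fresh <- ai
		}
	}()

	// Try the (possibly stale) addresses from the provider record first.
	if err := h.Connect(ctx, info); err == nil {
		return nil
	}

	// Direct dial failed: fall back to whatever the parallel walk finds.
	if ai, ok := <-fresh; ok {
		return h.Connect(ctx, ai)
	}
	return fmt.Errorf("no fresh addresses found for %s", info.ID)
}
```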

Impact & Tradeoff

This approach doesn't add any overhead, as under the current setting clients have to do the DHT walk anyway (after the first 10 mins, which should cover the majority of cases). However, if the multiaddress is still valid (which is not unlikely, given that DHT servers are not expected to switch their connectivity addresses very frequently), the client saves the latency of an entire DHT walk, i.e., resolution time drops to roughly half of what it is today.

@yiannisbot added the kind/enhancement label Sep 12, 2022

@yiannisbot
Member Author

Copying discussion from Slack.

From @Stebalien

The largest issue here is that libp2p won't try the DHT if it already thinks it knows a peer's addresses. This is a long-standing issue that needs to be fixed by refactoring the host/network.

  1. The abstractions we currently have make this annoying. We need to resolve all addresses before we start dialing.
  2. Checking the DHT is expensive, so we'd prefer to not do that unless we really need to.

One solution would be to:

  1. Try connecting with the addresses we know.
  2. If that fails, try the DHT.
  3. If we learn new information, try again.
  • We'll likely need to tune the backoff. Per-address backoff is currently 5s, but we probably want to increase the base?
  • We'll likely want to set a shorter timeout the first time we try to connect because we're less sure of the addresses.
    Ideally, we'd be able to feed additional addresses to an in-progress dial, but that requires a more complex dialer refactor (one we've been intending to do for a while, but haven't implemented).

@marten-seemann
Member

Coincidentally, I wrote up an issue on how to make libp2p's dialing more efficient and concurrent: libp2p/go-libp2p#1785. TL;DR: libp2p should sort addresses before starting to dial, and start dial attempts with short intervals in between. With this proposal, DNS address resolution will also run in parallel.
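
For illustration, a minimal staggered-dial sketch in that spirit; this is not go-libp2p's actual dialer, and the pre-ranked address input and the 250ms stagger interval are assumptions:

```go
package sketch

import (
	"context"
	"time"

	ma "github.com/multiformats/go-multiaddr"
)

// staggeredDial (illustrative) tries pre-sorted addresses in order,
// starting each attempt a short interval after the previous one, and
// returns as soon as the first attempt succeeds.
func staggeredDial(ctx context.Context, dial func(context.Context, ma.Multiaddr) error, ranked []ma.Multiaddr) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // the first success cancels the remaining attempts

	const stagger = 250 * time.Millisecond // assumed interval between attempts
	results := make(chan error, len(ranked))

	for i, addr := range ranked {
		go func(delay time.Duration, a ma.Multiaddr) {
			select {
			case <-time.After(delay):
			case <-ctx.Done():
				results <- ctx.Err()
				return
			}
			results <- dial(ctx, a)
		}(time.Duration(i)*stagger, addr)
	}

	var lastErr error
	for range ranked {
		err := <-results
		if err == nil {
			return nil
		}
		lastErr = err
	}
	return lastErr
}
```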

One could design logic (and a whole new API) to feed new addresses from the DHT into an existing Connect call. I'm wondering, however, if that's worth the effort. As mentioned above, checking the DHT is expensive and should be avoided if not necessary.

What do you think of the following logic?

  1. Start a Connect call with the known addresses. The default dial timeout is 15s (probably too long), but a shorter timeout could be chosen by passing in a context.Context.
  2. If dialing yields a connection, great. If not, query the DHT for new addresses.
  3. If this gives new addresses, delete the old address set from the peer store, and add the new addresses.
  4. Restart Connect, which will now use the new addresses.

Possible optimization: one could start querying the DHT while the first Connect call is still running, say if it hasn't returned after half the dial timeout.

All of this logic would live inside the routed host. This seems like a fairly self-contained change; I don't expect any large refactor to be necessary for this.
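
A sketch of that four-step sequence, assuming it lives in (or wraps) the routed host; the 3s first-dial timeout is an arbitrary placeholder and `connectWithFallback` is a hypothetical name:

```go
package sketch

import (
	"context"
	"time"

	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/core/peerstore"
	"github.com/libp2p/go-libp2p/core/routing"
)

// connectWithFallback (hypothetical) implements the four steps above.
func connectWithFallback(ctx context.Context, h host.Host, r routing.PeerRouting, p peer.ID) error {
	// 1. Start a Connect call with the known addresses, using a
	//    shorter dial timeout via the context.
	dialCtx, cancel := context.WithTimeout(ctx, 3*time.Second)
	err := h.Connect(dialCtx, peer.AddrInfo{ID: p})
	cancel()
	if err == nil {
		return nil // 2. dialing yielded a connection
	}

	// 2. Otherwise, query the DHT for new addresses.
	ai, err := r.FindPeer(ctx, p)
	if err != nil {
		return err
	}

	// 3. Delete the old address set from the peer store and add the
	//    new addresses.
	h.Peerstore().ClearAddrs(p)
	h.Peerstore().AddAddrs(p, ai.Addrs, peerstore.TempAddrTTL)

	// 4. Restart Connect, which will now use the new addresses.
	return h.Connect(ctx, peer.AddrInfo{ID: p})
}
```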

@yiannisbot @Stebalien, wdyt?

@cortze
Contributor

cortze commented Sep 30, 2022

Hey! I've been taking a closer look at how provider records are shared when looking for a CID in the DHT, and I found a possible limitation in the current go-libp2p / go-libp2p-kad-dht combination that could affect always sharing the provider's PeerID + its Multiaddrs.

Just a few pointers related to this Issue:

if a client asks for the CID within the first 10 minutes of the CID's publication, they get the multiaddress of the provider directly and can connect directly.

Since go-libp2p@v0.22.0, the ProviderAddrTTL has been increased from 10 mins to 30 mins, although not all of the network has updated to a kubo version that includes it.
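
For reference, the constant in question lives in go-libp2p's peerstore package; a node caches a provider's addresses roughly like this (illustrative helper, not the actual kad-dht code):

```go
package sketch

import (
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/core/peerstore"
)

// cacheProviderAddrs is an illustrative helper: addresses learned from a
// provider record are cached with peerstore.ProviderAddrTTL (10 mins
// before go-libp2p v0.22.0, 30 mins since). Once the TTL expires, only
// the PeerID is left to serve to requesting clients.
func cacheProviderAddrs(ps peerstore.Peerstore, provider peer.AddrInfo) {
	ps.AddAddrs(provider.ID, provider.Addrs, peerstore.ProviderAddrTTL)
}
```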

The TTL increase is something that I could clearly see in my measurements, summarized here.

For some context, I was publishing random CIDs to the network from Host_A and requesting the provider records (PRs) of those CIDs from Host_B. To be precise, I was requesting the providers directly from each peer contacted to keep the PRs, tracking whether they replied with the PeerID + []Multiaddr or only the PeerID.

The following picture shows how many of the peers contacted to keep the PR shared both the PeerID + []Multiaddr over the course of the study. Please note that the X axis shows the ping_rounds, which are 3 mins apart. We can see that, at the median, 17 peers share the full AddrInfo of the content provider for ~10 mins, while only 7 do so for the full 30 mins.

[Figure: number of peers keeping the PR that shared PeerID + []Multiaddr, per ping round]

I got a similar result when checking the results of the DHT lookups for the same CIDs. The major difference is that here the lookup was modified not to check the local dht.ProviderStore and not to finish the process after finding the first provider.

[Figure: same measurement, taken from the modified DHT lookups]

The interesting part comes when I compare this last image with the result that the lookup function returned (next image). Although there were always peers returning the full AddrInfo for the first 30 mins, the lookup did not always return the []Multiaddr field filled in. The reason? It had already returned a provider with the same PeerID and an empty []Multiaddr field (check the logic of the peerSet in the lookup function), which would force the client to perform a second DHT lookup to match the PeerID to its []Multiaddr, and that was avoidable by just hanging for a few milliseconds more.

[Figure: AddrInfo returned by the unmodified lookup function]
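
To make the failure mode concrete, here is a simplified sketch of a lookup peer set keyed only on the PeerID; this mirrors the behaviour described above and is not the actual go-libp2p-kad-dht code:

```go
package sketch

import (
	"github.com/libp2p/go-libp2p/core/peer"
)

// providerSet (simplified) deduplicates providers by PeerID only.
type providerSet struct {
	seen map[peer.ID]struct{}
}

func newProviderSet() *providerSet {
	return &providerSet{seen: make(map[peer.ID]struct{})}
}

// tryAdd emits a provider the first time its PeerID is seen. If an
// AddrInfo with an empty Addrs slice arrives first, a fuller record for
// the same peer arriving a few milliseconds later is silently dropped,
// forcing the client into a second DHT lookup for the addresses.
func (s *providerSet) tryAdd(ai peer.AddrInfo, out chan<- peer.AddrInfo) {
	if _, ok := s.seen[ai.ID]; ok {
		return
	}
	s.seen[ai.ID] = struct{}{}
	out <- ai
}
```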

Coming back to the actual discussion: we could avoid having to make an extra lookup if we share the provider's ID and the multiaddrs for at least 12 or 24 hours after a CID is published. What are the chances that a publisher updates its IP after publishing content?
However, we would still need to check how the lookup is performed to avoid forcing that second lookup due to the current logic.

One could design logic (and a whole new API) to feed new addresses from the DHT into an existing Connect call. I'm wondering, however, if that's worth the effort. As mentioned above, checking the DHT is expensive and should be avoided if not necessary.

Regarding @marten-seemann's comment, I am not sure what the most convenient way of addressing it is, but being able to add multiaddrs to an ongoing connection attempt is something that could work, and it would fit a parallel lookup to find the multiaddress when we are not sure whether the one reported with the PR is valid.

@yiannisbot
Member Author

Thanks folks for continuing the discussion here! Replying to a couple of notes:

@marten-seemann

One could design logic (and a whole new API) to feed new addresses from the DHT into an existing Connect call. I'm wondering, however, if that's worth the effort. As mentioned above, checking the DHT is expensive and should be avoided if not necessary.

What I'm arguing here is that we end up doing that second lookup anyway (assuming that the majority of content is requested >30mins after (re-)publication, which I would argue is true) so why not do it earlier. So I'm definitely in favour of the suggested logic, plus the optimisation :)

The default dial timeout is 15s (probably too long)

:-o This is way too long! :-D I think if a peer has not responded within 1s, the peer is slow anyway, so I would argue it's not worth the wait. Is there any other reason why this is set to such a high value? I would probably set the dial timeout to 1s (maybe a bit more) and start the optimisation (the parallel DHT query) 500-750ms after the initial dial. We could set up monitoring infra to see how often we actually get to connect through the pre-existing multiaddress vs having to fall back to the DHT (e.g., through experiments using Thunderdome) and then increase/decrease that interval.

@cortze

and that was avoidable by just hanging for a few milliseconds more

Who would hang for a few more milliseconds? I'm not sure I'm getting this.

However, we would still need to check how the lookup is performed to avoid forcing that second lookup due to the current logic.

I think that's ^ the bottom line here. @marten-seemann do you have any idea why this is happening? Otherwise I'm in favour of:

we could avoid having to make an extra lookup if we share the provider's ID and the multiaddrs for at least 12 or 24 hours after a CID is published.

with the caveat that we'd have to have that timeout (at half the dial timeout) to avoid hanging forever.

@marten-seemann
Member

This is way too long! :-D I think if a peer has not responded within 1s, the peer is slow anyway, so I would argue it's not worth the wait.

Agreed that 15s is too long, but 1s is definitely too short. Keep in mind that a libp2p handshake over TCP takes at least 4 RTTs, and more if there's packet loss and you need to retransmit packets. Other transports take even more RTTs (WebSocket, WebRTC).
Maybe 5s for a single dial would be a better value? However, once we implement libp2p/go-libp2p#1785, there might be additional delays if dialing QUIC is unsuccessful and we have to fall back to other transports.

What I'm arguing here is that we end up doing that second lookup anyway (assuming that the majority of content is requested >30mins after (re-)publication, which I would argue is true) so why not do it earlier. So I'm definitely in favour of the suggested logic, plus the optimisation :)

Not sure why the 30 mins matter here. iiuc, if we request after 30 minutes, we don't get any addresses from our initial DHT query, right? We need to distinguish two cases then:

  1. We have no prior knowledge about this peer('s addresses): In this case, we need to go to the DHT immediately. This is the current logic implemented by the routed host in libp2p.
  2. We have interacted with this peer before, and have some addresses in our peer store. Most likely, these addresses are still valid, assuming that peers usually migrate addresses once every 24h at most, when their ISP assigns a new IP. A reasonable design goal would be that our timers avoid >90% of unnecessary DHT queries (where unnecessary in this case means that dialing the known addresses would have succeeded, if given infinite time).

@yiannisbot
Member Author

if we request after 30 minutes, we don't get any addresses from our initial DHT query, right?

That's what we want to change: the peers that hold provider records return the multiaddress of the providing peer for longer than the current 30 min interval. In this case we get an address to dial directly, and if the peer has not changed its address (pretty likely for peers in DHT server mode) we'll be able to connect and avoid going through the DHT to do the PeerID -> multiaddress mapping.

@dennis-tra
Collaborator

I think the improvements to the dialling process are complementary to what Yiannis is suggesting. This might be obvious, but the discussion seemed to gravitate towards them. If I understand correctly, we need to change two things:

  1. Always (or at least for longer than 30 minutes after the provider record was written) return multiaddresses when a provider record is requested, and
  2. Given your comment @marten-seemann: implement the logic of first dialling the given multiaddresses and, if that fails, looking up new ones in the DHT and then dialling again

From my understanding, the logic around IP/transport prioritization, like the happy eyeballs mechanism, would be a separate, additional improvement and not prohibitive of the above two changes, right?

All of this logic would live inside the routed host. This seems like a fairly self-contained change; I don't expect any large refactor to be necessary for this.

I had a look at the code and indeed the effort looks manageable (I'd be happy to step in and make that addition). However, I'm wondering if the sketched logic is always the desired behaviour of consumers of the API. I'm not sure where it's used and if it could lead to undesired side effects as it's such a central API.

@marten-seemann
Member

From my understanding, the logic around IP/transport prioritization, like the happy eyeballs mechanism, would be a separate, additional improvement and not prohibitive of the above two changes, right?

Yes.

  1. Always (or at least for longer than 30 minutes after the provider record was written) return multiaddresses when a provider record is requested, and
  2. Given your comment (#9264 (comment)) @marten-seemann: implement the logic of first dialling the given multiaddresses and, if that fails, looking up new ones in the DHT and then dialling again

The longer the period is, the higher the probability that these addresses have become stale in the meantime, so implementing 2. becomes more pressing.

I had a look at the code and indeed the effort looks manageable (I'd be happy to step in and make that addition). However, I'm wondering if the sketched logic is always the desired behaviour of consumers of the API. I'm not sure where it's used and if it could lead to undesired side effects as it's such a central API.

Happy to review a PR! I don't think there will be unintended side effects; users of a routed host should expect the host to do everything to resolve the address. But who knows what we'll discover when it's implemented... ;)

@dennis-tra
Collaborator

dennis-tra commented Oct 11, 2022

The longer the period is, the higher the probability that these addresses have become stale in the meantime, so implementing 2. becomes more pressing.

Yup, I also think that 2. would be a requirement for increasing the multiaddress TTL in 1.

Happy to review a PR! I don't think there will be unintended side effects; users of a routed host should expect the host to do everything to resolve the address.

Great, let me check what's involved in increasing the TTL and get back here.


Another thing: are the multiaddresses for provider records only stored in memory? Could it then become an issue to keep those addresses around for all provider records? Not sure if we can derive a rough estimate of the increased resource consumption for a normal kubo node (whatever that is :D).
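
(For a rough sense of scale only, with assumed numbers rather than measurements: if a node held on the order of 10^5 provider records and cached ~200 bytes of multiaddresses per record, that would be ~20 MB of extra memory; at 10^6 records it would be ~200 MB. The real figure would need to be measured.)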

@yiannisbot
Member Author

Good points brought up! I do think that trying the existing multiaddresses first makes sense, but for how long is still to be figured out. A straightforward setting is the provider record republish interval, although that is several times longer than the 30 min value we have today. Also, setting it to a fixed value leaves a magic number in the codebase :) But I'm not sure it can be any different/dynamic - an alternative would be for provider record holders to periodically ping the multiaddresses they hold and return, or not, the multiaddress accordingly. But that would be significant extra overhead for record holders, which is not worth the effort (IMO).

In that sense, the timeout after which the requestor goes down the DHT route is important. Would we want the DHT lookup to start in parallel (and abort if the multiaddress is still valid)?

Complementary things we could do, though I'm not sure how practically useful they would be:

  • Set up a Thunderdome experiment to replay gateway and bitswap traffic and see the success rate when the value (let's call it the "multiaddress TTL"?) is set to 1hr, 2hrs, 4hrs ... up to 22hrs (the currently suggested record republish interval).
  • Build a tool (extend Nebula?) to periodically ping the pair <PeerID, Multiaddress> for every peer it learns about and see how frequently it changes.
  • Get a distribution of how long after the record has been (re-)published we see incoming requests. Not sure how easy this is to do, as we might need to cross-compare request timestamps with hydra peer record receipts.

dennis-tra added a commit to dennis-tra/go-libp2p-kad-dht that referenced this issue Oct 21, 2022
We keep the Multiaddresses of providing peers around for much longer
as this means we'll return them alongside the provider records and
likely make the second DHT lookup for the peer record obsolete.

The assumption is that peers don't change network addresses often.
For context: ipfs/kubo#9264
@dennis-tra
Collaborator

Created Pull Requests for the two changes:

@dennis-tra
Collaborator

I think this issue can be closed now? Or is something left? @yiannisbot

@yiannisbot
Member Author

This is completed now and could be closed, but let's leave it open until we have some results showing the performance impact. We'll have everything in place to capture the performance difference and will report here.

@BigLep
Contributor

BigLep commented Jun 21, 2023

@yiannisbot: I'm embarrassed to say I missed this work. A couple of things:

  1. Did it ever get announced in a Kubo release?
  2. Is the probelab team actually going to get to measuring the performance difference? If we don't expect this to happen in 2023Q3, I think realistically we should close this. If we want to link to a measurement task on the probelab backlog, we can.

@yiannisbot
Member Author

  1. Did it ever get announced in a Kubo release?

Not sure about this. I didn't track it, unfortunately.

  2. Is the probelab team actually going to get to measuring the performance difference? If we don't expect this to happen in 2023Q3, I think realistically we should close this. If we want to link to a measurement task on the probelab backlog, we can.

This has been forgotten, unfortunately, and we didn't get the right measurement scripts in place to see the performance difference it made in practice. It also coincided with a few other incidents at the beginning of the year, so it would have been difficult to isolate the performance difference due to this change anyway. Maybe that's why we didn't bother back then. In any case, this is good to be closed.
