Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] feat(iroh-sync): download from peer that informed us about a change #1314

Merged
merged 6 commits into from
Aug 7, 2023

Conversation

Frando
Copy link
Member

@Frando Frando commented Jul 28, 2023

Description

Try to download content from the peer that informed us about a change. During set reconciliation that is the peer we're talking to, during gossip it's the next peer in the graph (so not the sender of the message but the peer which forwarded the message to us).

This is not finished and waiting for a better downloader which will come soon, but might be interesting to try out now still.

Notes & open questions

  • We'll want some infrastructure to easily extend this to download from other peers too if both author and informer failed
  • The downloader will definitely need to support retries after timeouts, because right now you will easily get a gossip message before the other peer downloaded the content themselves

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.

@Frando Frando changed the base branch from main to sync-gossip-bytes July 28, 2023 12:15
@dignifiedquire
Copy link
Contributor

I think this is a great first direction, and getting this in and working to test it out would be great.

@Frando Frando force-pushed the sync-gossip-bytes branch 2 times, most recently from d2660ae to de61c04 Compare July 28, 2023 16:36
@Frando Frando force-pushed the sync-download-from-prev-peer branch from 3c243b9 to ba10245 Compare July 28, 2023 16:39
@Frando Frando force-pushed the sync-download-from-prev-peer branch from ba10245 to 5df8b13 Compare July 28, 2023 17:29
@Frando
Copy link
Member Author

Frando commented Aug 3, 2023

The Downloader currently acts immediately on new requests and doesn't do any retries. It also does not have a proper queue internally.

We want to be able to queue downloads with retries and different timeouts. My idea was to add a time-based queue, maybe using the TimerMap from iroh_gossip::util. Then I thought to change the API to expose download(hashes: Vec<Hash>, peer: Peer, delay: Option<Instant>). The actor would maintain a timer queue of pending requests and their delay, and poll that for the next operation, check if the operation is still needed, and if so enqueue the future to actually do the download. The actor should also keep track of running downloads, and have a limit of how many peers are being dialed or connected to in parallel (configurable). Code-wise, I thought to add at least a DownloadState for each pending download or so, and then similar to the current code have indexes into those for PeerId and Hash (and Instant if using something like the TimerMap).

With this, we can then:

  • On new entry message via gossip, enqueue download from
    • peer we recv'd from with a very small delay
    • from our other current active gossip neighbors with a larger delay
    • from the author with a larger delay plus random offset (*note: for this we need to add the peer info (address&region), with a signature, to the gossip message in addition to the SignedEntry)
  • Queue more retries if all fail up to some limit

Later on, we'd also want to add an explicit command to retry all missing hashes in a document from a peer, and optionally a mode that would do that automatically if we get new neighbors. This will initially be quite inefficient for large docs, because we have no ẁant/have` exchange in iroh-bytes. So maybe defer this a bit.

Edit
Another thought on the API, with peer and delay we'd also want to pass a enum RetryStrategy { None, Auto } or so likely, with auto being some incremental backoff with max retries or similar.

Edit 2
Currently the quinn::Connections are dropped and thus cancelled if there is no immediate next download pending after a download finished. Likely we'll want to keep them around for a configurable-with-default amount of time, and then drop them if no operation happened. Redialing is quite cheap but not free (one roundtrip in current setup).

@Frando Frando force-pushed the sync-download-from-prev-peer branch from 7465a07 to ed8cb2d Compare August 7, 2023 11:51
@Frando
Copy link
Member Author

Frando commented Aug 7, 2023

@divagant-martian I moved the downloader out from iroh/src/sync/content.rs to iroh/src/downloader.rs - it is not related to sync only so it makes more sense this way.

Once CI passes (apart from the crosscompiles) I'd merge this into the main iroh sync PR #1216 . Less PRs and a cleaner slate for you to build on.

After merging this, let's continue the discussion in #1334 - rklaehn wanted to add some comments too.

@Frando Frando merged commit b37aac4 into sync-gossip-bytes Aug 7, 2023
@dignifiedquire dignifiedquire deleted the sync-download-from-prev-peer branch November 1, 2023 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants