
example: Multiprovider downloader #62

Open · wants to merge 47 commits into main
Conversation

rklaehn (Collaborator) commented Feb 21, 2025

Description

Add the current state of the multiprovider downloader as part of an example.

The multiprovider downloader requires infrastructure from the store that does not exist yet, specifically the ability to watch bitfields for blobs. I am working on a store refactor that will add this; to try out the multiprovider downloader in the meantime, I have basically faked it for now.

For usage see the example README.md

Breaking Changes

None

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.


WIP

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
WIP

WIP
This is mostly for the case where a peer download aborts with an error, since
if it just downloads the required data we will make progress when the local bitmap changes.
bitmap evokes the thought of 2d bitmaps like images
enum is just a wrapper around them now
This is a bit unfortunate, but there are major refactorings coming for the
store, and the current code has to fake a part of the functionality that
is needed from the store.

This way people can take a look at the code structure and also try it out.
Once the main store changes are merged, the multiprovider downloader will
move into the main crate.

github-actions bot commented Feb 21, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh-blobs/pr/62/docs/iroh_blobs/

Last updated: 2025-02-21T11:40:29Z

@@ -252,7 +252,7 @@ pub async fn valid_ranges<D: MapMut>(entry: &D::EntryMut) -> anyhow::Result<Chun
     }
     let valid: ChunkRanges = valid_from_data.intersection(&valid_from_outboard);
     log!("valid_from_data: {:?}", valid_from_data);
-    log!("valid_from_outboard: {:?}", valid_from_data);
+    log!("valid_from_outboard: {:?}", valid_from_outboard);
rklaehn (Collaborator, Author):
Just improves the logging and fixes a bug in it: the second log line printed valid_from_data where valid_from_outboard was meant.

@rklaehn rklaehn changed the title Multiprovider example: Multiprovider downloader Feb 21, 2025
@rklaehn rklaehn marked this pull request as ready for review February 24, 2025 09:23
/// to add new peers to the list of options.
pub trait DownloadPlanner: Send + std::fmt::Debug + 'static {
/// Make a download plan for a hash, by reducing or eliminating the overlap of chunk ranges
fn plan(&mut self, hash: Hash, options: &mut BTreeMap<NodeId, ChunkRanges>);
Collaborator:
why is this called options?

rklaehn (Collaborator, Author):
Options to download from. What goes in is the set of options to download from; once plan is done, it contains the actual plan. But I agree, it is not a great name!
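The in/out semantics described above can be sketched with simplified stand-in types (the real `NodeId`, `ChunkRanges`, and planner implementations in iroh-blobs are different; this is a hypothetical, minimal planner that just eliminates overlap by first-come-first-served assignment):

```rust
use std::collections::BTreeMap;
use std::ops::Range;

// Hypothetical stand-ins: peers are u64 ids, "ChunkRanges" is a
// sorted Vec of non-overlapping chunk ranges.
type NodeId = u64;
type ChunkRanges = Vec<Range<u64>>;

/// Subtract range b from range a, returning the 0..2 leftover pieces.
fn subtract(a: &Range<u64>, b: &Range<u64>) -> Vec<Range<u64>> {
    let mut out = Vec::new();
    if a.start < b.start {
        out.push(a.start..a.end.min(b.start));
    }
    if a.end > b.end {
        out.push(a.start.max(b.end)..a.end);
    }
    out
}

/// Naive planner: walk peers in order and strip from each peer every
/// range already claimed by an earlier peer, so no chunk is fetched
/// twice. On entry `options` holds what each peer *could* provide; on
/// exit it holds the actual plan. Real planners would balance load.
fn plan(options: &mut BTreeMap<NodeId, ChunkRanges>) {
    let mut claimed: ChunkRanges = Vec::new();
    for ranges in options.values_mut() {
        let mut kept = Vec::new();
        for r in ranges.drain(..) {
            // keep only the parts of r not overlapping any claimed range
            let mut parts = vec![r];
            for c in &claimed {
                parts = parts.into_iter().flat_map(|p| subtract(&p, c)).collect();
            }
            kept.extend(parts);
        }
        claimed.extend(kept.iter().cloned());
        *ranges = kept;
    }
}

fn main() {
    let mut options: BTreeMap<NodeId, ChunkRanges> = BTreeMap::new();
    options.insert(1, vec![0..10]);
    options.insert(2, vec![5..15]);
    plan(&mut options);
    // peer 1 keeps 0..10; peer 2 is left with the non-overlapping tail
    assert_eq!(options[&1], vec![0..10]);
    assert_eq!(options[&2], vec![10..15]);
    println!("{:?}", options);
}
```

This also illustrates why the name reads oddly: the same map is both input ("options") and output ("plan").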

@@ -241,7 +241,7 @@ pub async fn valid_ranges<D: MapMut>(entry: &D::EntryMut) -> anyhow::Result<Chun
     // compute the valid range from just looking at the data file
     let mut data_reader = entry.data_reader().await?;
     let data_size = data_reader.size().await?;
-    let valid_from_data = ChunkRanges::from(..ChunkNum::full_chunks(data_size));
+    let valid_from_data = ChunkRanges::from(..ChunkNum::chunks(data_size));
Collaborator:
what about this change?

rklaehn (Collaborator, Author):
There was a bug that led to valid_from_data being sometimes 1 chunk shorter than needed. This does not really affect you in the sendme case - it will just download 16 KiB more when resuming. But now that we want to be precise about chunk ranges, it does matter.

However, this whole valid_from_data is a hack/heuristic anyway. It assumes that the data is contiguous up to the size, which is true for sendme and also for normal iroh-blobs usage with the current API. But it is definitely no longer the case once you can have gaps in the valid range. This is one of many reasons for my current rabbit hole of redesigning the store to take bitfields into account properly, including persistence.

As mentioned in the blog post, for small to medium blobs it is perfectly fine to just recompute the availability bitfield from scratch. That has the advantage of always being correct, even if there is a bit flip in the data or outboard. But for very large blobs you don't want to do it.
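The off-by-one-chunk bug comes down to floor vs ceiling division. A sketch of the arithmetic (the helper names mirror bao-tree's `ChunkNum::full_chunks`/`ChunkNum::chunks`; the 1024-byte chunk size matches BLAKE3/bao chunks, and these standalone functions are simplified stand-ins, not the real implementations):

```rust
// BLAKE3/bao chunk size in bytes.
const CHUNK_SIZE: u64 = 1024;

/// Floor: counts only complete chunks, dropping a trailing partial chunk.
fn full_chunks(size: u64) -> u64 {
    size / CHUNK_SIZE
}

/// Ceiling: counts the trailing partial chunk too.
fn chunks(size: u64) -> u64 {
    size.div_ceil(CHUNK_SIZE)
}

fn main() {
    // For a 2500-byte blob, the last 452 bytes live in a partial third
    // chunk: full_chunks misses it, so valid_from_data came up one
    // chunk short and a resume would re-download up to 16 KiB.
    assert_eq!(full_chunks(2500), 2);
    assert_eq!(chunks(2500), 3);
    // The two agree only on exact multiples of the chunk size.
    assert_eq!(full_chunks(2048), chunks(2048));
    println!("ok");
}
```

For sendme-style resume the extra download was harmless, but once chunk ranges are split precisely across multiple providers, an off-by-one range boundary matters.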

@rklaehn rklaehn mentioned this pull request Mar 10, 2025
22 tasks