feat: ability to open a hash as a seekable bao file #72

Draft
wants to merge 2 commits into
base: main


rklaehn
Collaborator

@rklaehn rklaehn commented Mar 11, 2025

Description

Adds the ability to open a partial or complete hash as a file that supports std::io::Read and std::io::Seek. If you read at a position where the data is not present, you will get an io error, since all reads are validated.
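To make the failure mode concrete, here is a minimal sketch of a reader that implements std::io::Read and std::io::Seek over partially present data. `PartialFile` and its `present` range list are hypothetical names invented for illustration, not the types in this PR, and the sketch checks a plain list of byte ranges where the PR validates reads using bao-tree.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Hypothetical stand-in for a partially downloaded blob.
pub struct PartialFile {
    data: Vec<u8>,            // full-size buffer; bytes outside `present` are meaningless
    present: Vec<Range<u64>>, // byte ranges that are actually available
    pos: u64,                 // current read position
}

impl PartialFile {
    pub fn new(data: Vec<u8>, present: Vec<Range<u64>>) -> Self {
        Self { data, present, pos: 0 }
    }
}

impl Read for PartialFile {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let len = self.data.len() as u64;
        let pos = self.pos;
        if pos >= len || buf.is_empty() {
            return Ok(0); // end of file, or nothing requested
        }
        // Find the available range containing the current position;
        // if there is none, the read fails instead of returning garbage.
        let range = self
            .present
            .iter()
            .find(|r| r.contains(&pos))
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "data not present"))?;
        // Read at most up to the end of the available range (short read).
        let end = range.end.min(len).min(pos + buf.len() as u64);
        let n = (end - pos) as usize;
        let start = pos as usize;
        buf[..n].copy_from_slice(&self.data[start..start + n]);
        self.pos = end;
        Ok(n)
    }
}

impl Seek for PartialFile {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        let len = self.data.len() as i128;
        let target = match from {
            SeekFrom::Start(o) => o as i128,
            SeekFrom::End(o) => len + o as i128,
            SeekFrom::Current(o) => self.pos as i128 + o as i128,
        };
        if target < 0 || target > u64::MAX as i128 {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "invalid seek"));
        }
        // Seeking into a missing region is allowed; only the read fails.
        self.pos = target as u64;
        Ok(self.pos)
    }
}
```

In this sketch seeking never fails because of missing data; the error surfaces on the next read, which matches the "reads are validated" wording above.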

The created file is completely independent of the store; you can use it even after the store is dropped.

This relies on some experimental code in bao-tree. I will also support it in the new store.

The underlying implementation uses positioned_io::ReadAt, and I also have a wrapper similar to tokio::io::Cursor that wraps the file so that it implements tokio::io::AsyncRead and tokio::io::AsyncSeek.
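The cursor idea can be sketched in synchronous form with only the standard library. The `ReadAt` trait below is a local stand-in with the basic shape of a positioned read (explicit offset plus buffer), not the positioned_io crate itself, and `PosCursor` is a hypothetical name. The wrapper described above targets the tokio traits, which this sketch does not attempt.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Local stand-in for a positioned read: stateless, takes an explicit offset.
pub trait ReadAt {
    fn read_at(&self, pos: u64, buf: &mut [u8]) -> io::Result<usize>;
    fn size(&self) -> u64;
}

impl ReadAt for Vec<u8> {
    fn read_at(&self, pos: u64, buf: &mut [u8]) -> io::Result<usize> {
        if pos >= self.len() as u64 {
            return Ok(0); // reading at or past the end yields no bytes
        }
        let start = pos as usize;
        let n = buf.len().min(self.len() - start);
        buf[..n].copy_from_slice(&self[start..start + n]);
        Ok(n)
    }

    fn size(&self) -> u64 {
        self.len() as u64
    }
}

/// Adds a read position on top of a `ReadAt` source so it can be used
/// wherever `Read + Seek` is expected.
pub struct PosCursor<T> {
    inner: T,
    pos: u64,
}

impl<T: ReadAt> PosCursor<T> {
    pub fn new(inner: T) -> Self {
        Self { inner, pos: 0 }
    }
}

impl<T: ReadAt> Read for PosCursor<T> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Delegate to the positioned read, then advance by what was read.
        let n = self.inner.read_at(self.pos, buf)?;
        self.pos += n as u64;
        Ok(n)
    }
}

impl<T: ReadAt> Seek for PosCursor<T> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        let target = match from {
            SeekFrom::Start(o) => o as i128,
            SeekFrom::End(o) => self.inner.size() as i128 + o as i128,
            SeekFrom::Current(o) => self.pos as i128 + o as i128,
        };
        if target < 0 || target > u64::MAX as i128 {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "invalid seek"));
        }
        self.pos = target as u64;
        Ok(self.pos)
    }
}
```

Keeping the position in the wrapper rather than in the source means the source needs only shared access, so several cursors could read the same data independently.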

Breaking Changes

None?

Notes & open questions

This is very WIP

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.


Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh-blobs/pr/72/docs/iroh_blobs/

Last updated: 2025-03-11T19:08:54Z

@maboesanman

maboesanman commented Mar 11, 2025

Upon reading the signature

pub async fn open(&self, hash: Hash) -> Result<impl std::io::Read + std::io::Seek> { todo!() }

I would personally expect this to stay pending until the entire blob is downloaded, and then give me a reader that won't error (except in extreme cases like file permissions changing or storage devices being physically removed). But it sounds like the implementation, as described in the PR, gives you a future that resolves much more eagerly, regardless of whether the blob is downloaded. That would be surprising and potentially frustrating when the reader is passed to code that expects read failures to be unlikely.

Maybe this api could be adjusted to something like:

pub async fn open_partially_downloaded(&self, hash: Hash) -> Result<impl std::io::Read + std::io::Seek> {
    // current implementation of the open function
}

pub async fn open(&self, hash: Hash) -> Result<impl std::io::Read + std::io::Seek> {
    let download_complete_fut = some_future(); // placeholder: resolves when the download is complete
    let reader_future = self.open_partially_downloaded(hash);
    // join! awaits both futures itself, so no trailing .await is needed
    let (_, reader) = join!(download_complete_fut, reader_future);
    Ok(reader?)
}

@rklaehn
Collaborator Author

rklaehn commented Mar 12, 2025

Yeah, naming is TBD. This is for the YOLO case where you just want to try to read. I think that is a quite valid use case.

The other variant is also possible, but since it requires interaction with the store and the downloader it needs a living store and protocol handler, whereas this gives you a fully independent "file" that stays alive even if the store is dropped.

Projects
Status: 🏗 In progress

2 participants