Support getting async and sync handles for existing blob store items. #60

Open
rklaehn opened this issue Feb 21, 2025 · 4 comments

@rklaehn
Collaborator

rklaehn commented Feb 21, 2025

We should have a way to get a handle from the store that lets you treat a blob as a file, either in a sync context (implementing the Read and Seek traits) or in a tokio-flavoured async context (implementing tokio::io::AsyncRead and tokio::io::AsyncSeek).
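
To illustrate what such a handle would buy callers in the sync case, here is a minimal sketch of code written only against Read + Seek. The helper name read_range is hypothetical and not part of iroh; any handle the store eventually returns could be passed to it, and in the meantime std::io::Cursor or std::fs::File can stand in.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Read exactly `len` bytes starting at `offset` from anything that behaves
/// like a file. `read_range` is a hypothetical helper, not an iroh API.
fn read_range<R: Read + Seek>(handle: &mut R, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    // Position the handle, then fill the buffer or fail with UnexpectedEof.
    handle.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    handle.read_exact(&mut buf)?;
    Ok(buf)
}
```

Because the function only depends on the two std traits, it would not need to change if the blob handle replaced a plain file.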

@rklaehn
Collaborator Author

rklaehn commented Feb 21, 2025

There are a number of questions about this:

  1. Which version is more important for people, the sync one or the async one? Personally I am not very fond of the local io traits from tokio, but I realize that lots of people use them.

  2. Does this need the out-of-process RPC or not? We have spent a lot of work on an out-of-process RPC layer and on language bindings, but have come to realize that lots of people use iroh directly from Rust. So we are currently focusing most on the "iroh is a Rust library" use case and are deprioritizing cross-process FFI and RPC. See https://www.iroh.computer/blog/ffi-updates

  3. There are two ways to do this, both technically viable. I am not sure whether we should offer one of them or both, and which one to implement first. The choice only matters for partial blobs, but these will become more important as we move from sendme use cases to other use cases:

  • Access only for verified data. Seeking to a place where the data is incomplete and then reading will produce an io error.
  • Access for all data. Seeking to a place where the data is incomplete and then reading will return zeroes.

I guess in the spirit of content addressing the first one would be the one to implement first. You only ever get data that is consistent with the hash you have opened, or io errors. However, the second version is much faster since it does not need to validate anything.
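
The difference between the two options can be sketched with a toy in-memory reader. Everything here is an assumption for illustration: PartialBlob and GapPolicy are hypothetical names, "verified" is just a list of byte ranges rather than real hash verification, and gaps in the buffer are assumed to hold zeroes (as in a sparse file).

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// How reads of not-yet-verified regions behave (hypothetical names).
#[derive(Clone, Copy, PartialEq, Debug)]
enum GapPolicy {
    /// Option 1: only verified data is readable; gaps yield an io error.
    Error,
    /// Option 2: all data is readable; gaps read as zeroes.
    Zeroes,
}

/// Toy stand-in for a partially downloaded blob.
struct PartialBlob {
    data: Vec<u8>,             // full-size buffer, gaps hold zeroes
    verified: Vec<Range<u64>>, // byte ranges treated as verified
    pos: u64,
    policy: GapPolicy,
}

impl PartialBlob {
    fn new(data: Vec<u8>, verified: Vec<Range<u64>>, policy: GapPolicy) -> Self {
        Self { data, verified, pos: 0, policy }
    }

    /// Number of verified bytes starting at `pos` (0 if `pos` is in a gap).
    fn verified_run(&self, pos: u64) -> u64 {
        self.verified
            .iter()
            .find(|r| r.contains(&pos))
            .map(|r| r.end - pos)
            .unwrap_or(0)
    }
}

impl Read for PartialBlob {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let size = self.data.len() as u64;
        if self.pos >= size || buf.is_empty() {
            return Ok(0);
        }
        let mut n = (buf.len() as u64).min(size - self.pos);
        if self.policy == GapPolicy::Error {
            let run = self.verified_run(self.pos);
            if run == 0 {
                return Err(io::Error::new(
                    io::ErrorKind::Other,
                    "data at this offset is not yet verified",
                ));
            }
            // Never hand out bytes past the end of the verified range.
            n = n.min(run);
        }
        let (start, n) = (self.pos as usize, n as usize);
        buf[..n].copy_from_slice(&self.data[start..start + n]);
        self.pos += n as u64;
        Ok(n)
    }
}

impl Seek for PartialBlob {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        let target = match from {
            SeekFrom::Start(o) => o as i128,
            SeekFrom::End(d) => self.data.len() as i128 + d as i128,
            SeekFrom::Current(d) => self.pos as i128 + d as i128,
        };
        // Seeking itself never fails on a gap; only the following read may.
        self.pos = u64::try_from(target)
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "seek out of range"))?;
        Ok(self.pos)
    }
}
```

In this sketch the strict policy pays for a range lookup on every read, which hints at why the unverified variant is the cheaper one; a real implementation would additionally have to check the data against the hash.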

@n0bot n0bot bot added this to iroh Feb 21, 2025
@maboesanman

The API I would expect to find (which may or may not be realistic) is two functions:

read_completed(hash) -> Option<impl Read + Seek>

read_in_progress(hash) -> impl AsyncRead + AsyncSeek

The completed variant is basically a pass-through for a file reader.

The in-progress variant would simply wait, indefinitely, until the bytes are available. Seeking around might tell the underlying blobs machinery to prioritize certain chunks. There may be other error conditions as well.
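
As a rough sketch of the completed half of that API, assuming a toy in-memory store: ToyStore, Entry, and the Hash alias are hypothetical stand-ins and not iroh types. Only read_completed is shown; the waiting in-progress variant would need an async runtime and is not modeled here.

```rust
use std::collections::HashMap;
use std::io::{Cursor, Read, Seek};

/// Stand-in for a 32-byte content hash (hypothetical alias).
type Hash = [u8; 32];

/// Toy entry state: either all bytes are present or only some of them.
enum Entry {
    Complete(Vec<u8>),
    Partial(Vec<u8>),
}

/// Toy in-memory store, used only to illustrate the proposed signature.
struct ToyStore {
    entries: HashMap<Hash, Entry>,
}

impl ToyStore {
    /// Returns a sync reader only if the blob is known and complete.
    fn read_completed(&self, hash: &Hash) -> Option<impl Read + Seek + '_> {
        match self.entries.get(hash)? {
            // A complete blob is just a pass-through to its bytes.
            Entry::Complete(bytes) => Some(Cursor::new(bytes.as_slice())),
            // Incomplete blobs are not served by this variant.
            Entry::Partial(_) => None,
        }
    }
}
```

Returning Option keeps "unknown hash" and "not complete yet" out of the io error path, so callers can decide whether to fall back to an in-progress reader.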

@maboesanman

Perhaps read_in_progress needs to be split into a seekable but unverified version and a non-seekable but verified version.

@rklaehn
Collaborator Author

rklaehn commented Mar 11, 2025

Here is a very WIP PR that adds the ability to get an impl Read + Seek for a complete or incomplete file. It can also do tokio AsyncRead and AsyncSeek. What it does not do, however, is trigger downloads on seek; that would be a separate piece of work.

#72


2 participants