streaming support #32
Yes, that would be great. I'm not sure it's quite as easy with workers etc. It might be good to propose an API before writing it. Maybe the Streams API is a good reference, or at least good in the sense of being a solution devs are already familiar with.

Also, maybe this is already clear, but you can pass a blob currently, and only the parts of the zip file needed for an individual entry are in memory. Of course, if that one entry is large then yes, you'd need streaming to handle it. Ideally, like the current API, you could stream it to a blob. In other words, it should be possible to do this

I know that's not part of a streaming API, but it is something "streaming" internally would allow. I think right now, IIRC, it would run out of memory, though it's been a while since I've touched this code.
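To make the "stream an entry to a blob" idea above concrete, here is a minimal sketch. It assumes a hypothetical `entry.stream()` method returning a WHATWG `ReadableStream` of decompressed bytes; neither `entry.stream()` nor `entryToBlob` is part of the library today, this only illustrates the proposed shape.

```javascript
// Sketch: collect a (hypothetical) streaming entry into a Blob without ever
// holding the whole decompressed file in a single ArrayBuffer up front.
async function entryToBlob(entry) {
  // Response can consume a ReadableStream incrementally and hand back a Blob.
  return await new Response(entry.stream()).blob();
}

// Stand-in entry whose stream() yields two chunks, mimicking decompressed data.
const fakeEntry = {
  stream() {
    const chunks = [
      new TextEncoder().encode('hello '),
      new TextEncoder().encode('world'),
    ];
    return new ReadableStream({
      pull(controller) {
        const chunk = chunks.shift();
        if (chunk) controller.enqueue(chunk);
        else controller.close();
      },
    });
  },
};
```

Because chunks are enqueued on demand, peak memory stays proportional to the chunk size rather than the entry size.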
Yeah, getting back a blob would be ideal. I think the ideal API would be to add a

For our use case we actually need to read the entry headers and the central directory, as well as the entry contents (uncompressed). We were hoping to mostly reuse the entry parsing logic from this library to get the offsets, and reuse the readers to read data from it. I saw that I could get the entry offset from the

Specifically, this is going to be part of our official implementation for the IPFS wACZ custom chunking we're doing, which will enable us to deduplicate content across web archive collections. https://github.com/webrecorder/specs/blob/main/wacz-ipfs/latest/index.md
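For readers following along, locating the central directory means finding the End of Central Directory (EOCD) record at the tail of the archive; the field offsets below follow the ZIP APPNOTE layout. `findEocd` is an illustrative helper (no ZIP64 handling), not part of this library.

```javascript
// Sketch: scan backwards for the EOCD signature (0x06054b50) and read the
// entry count and central directory offset from the record. The EOCD is at
// least 22 bytes and sits at the end of the file, before an optional comment.
function findEocd(bytes) {
  for (let i = bytes.length - 22; i >= 0; i--) {
    if (bytes[i] === 0x50 && bytes[i + 1] === 0x4b &&
        bytes[i + 2] === 0x05 && bytes[i + 3] === 0x06) {
      const view = new DataView(bytes.buffer, bytes.byteOffset + i);
      return {
        entryCount: view.getUint16(10, true),        // total number of entries
        centralDirOffset: view.getUint32(16, true),  // where the central directory starts
      };
    }
  }
  return null; // not a zip file, or EOCD not found
}
```

With `centralDirOffset` in hand, a ranged reader only needs to fetch the central directory and then each entry's local header and data, which is what makes the offset-reuse approach above workable for very large archives.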
It doesn't seem like adding

Off the top of my head
Blobs could just be piped through
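A quick sketch of that piping idea: a `Blob` already exposes a `ReadableStream` via `blob.stream()`, so it can be run through any `TransformStream` (in a real zip reader that would be a decompressor such as `DecompressionStream`) and collected back into a `Blob` without buffering everything. `pipeBlobThrough` is a hypothetical helper for illustration, not part of this library.

```javascript
// Sketch: pipe a Blob's byte stream through a transform and collect the
// result as a new Blob; memory use stays bounded by the chunk size.
async function pipeBlobThrough(blob, transform) {
  return await new Response(blob.stream().pipeThrough(transform)).blob();
}
```

An identity `new TransformStream()` passes bytes through unchanged, which is handy for testing the plumbing before wiring in a real decompressor.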
Hey, we're thinking of using this as part of Webrecorder for working with web archives.
One limitation is that we're dealing with files that are too large to practically store in an ArrayBuffer, so we need to use streaming interfaces.
Would you be interested in adding this functionality to your readers, or would you be interested in a pull request that adds it? I think it should be easy to fit in with the existing codebase.
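One way the "readers" part of this could look: zip parsing needs random access (the central directory is at the end, entry data elsewhere), so a reader built on ranged reads avoids ever loading the whole archive into one ArrayBuffer. The `{ getLength, read }` shape below is an assumption for illustration, not this library's actual reader API.

```javascript
// Hedged sketch of a random-access reader over a Blob. Blob.slice() is lazy,
// so only the requested byte range is ever materialized in memory.
class BlobRangeReader {
  constructor(blob) {
    this.blob = blob;
  }
  async getLength() {
    return this.blob.size;
  }
  async read(offset, length) {
    const slice = this.blob.slice(offset, offset + length);
    return new Uint8Array(await slice.arrayBuffer());
  }
}
```

The same shape works for remote archives by answering `read()` with HTTP Range requests, which is what makes very large web archive files practical to work with.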