-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit multiple data chunks from output Streams (was: Proper streaming) #179
Comments
If you've not already done so, have a look at the Stream discussion in #30. At the time, @jcupitt created a load-from-stream branch with some initial ideas, mainly around the creation of a VipsInputStream/VipsOutputStream. On a related note, I spotted this recent change in io.js that looks like it could make handing JavaScript Stream objects in C++ easier. |
My final thought back then was in this comment: #30 (comment) Thinking about it again, it might be possible to do a hybrid of buffering and streaming. You could read and store data off the input until there was enough there that you could get all the metadata you need to be able to decide how to handle the file. Once you've picked an approach, close the image and start again. This time you feed the reader the stored data from the first pass and only start reading off the input again when that runs out. At that point (presumably) you are decoding actual pixels. There are still some awful problems. What if you send 10 jpg images down a pipe to a resizer, but image 4 is corrupt. How will the reader recover from the error? |
I'm sorry, I missed out the very vital part that is that I'm talking about output, not input. I'm not that interested in being able to pipe more than one image and for that I think it would be more reasonable to use object-streams ( My use case is to stream the output image data to S3 (or to any other service for that matter...). My ideal usage would be like this: var out = sharp(bufferObject).resize(1080, 1080).min().toStream();
S3.upload({ Body: out, Key: 'my-image.jpg' }, cb); This could be done as a said in number 2 by letting Actually, even more usual is probably piping directly back to the browser on http request. server.on('request', function (req, res) {
Sharp(myBuffer).resize(1080, 1080).min().toStream().pipe(res);
}); Why I like to have |
Ah OK, that makes sense. I agree, this would be an nice cleanup to libvips. The load-from-stream branch does this already, I'll fix it up a bit and add it. |
@jcupitt So what's the resolution on this issue? It looks like the load-from-stream branch was closed. |
I spent a bit of time looking at the branch, but master has moved and merging it would have taken quite a lot of work. Another alternative would be GIO. It has a thing called GFile which does more or less the same thing as VipsInputStream / VipsOutputStream, but better: a POSIX-y IO API which can have memory, files, sockets, URIs, or whatever you like as the backend: https://developer.gnome.org/gio/stable/GFile.html We could rework the png/jpg/webp read and write operations to use GFile. We'd then be able to open images directly over http or ftp, for example:
No idea if that would be useful. On the other hand ... It would mean adding yet another required dependency. And VIpsInputStream / VipsOutputStream were small and easy to understand and extend. |
@jcupitt great work on the streams branch, a shame it could never be merged. Just adding my support for any work on true streams, which would allow significantly less memory usage since right now sharp needs to load the source image into memory, and then hold the whole output buffer in memory, before writing out. |
I'm trying to find a conclusion to this, did anyone succeed in getting sharp to write to a readableStream that could as suggested above be piped to a request from a browser. |
libvips 8.9 (due rsn) has true streaming: https://libvips.github.io/libvips/2019/11/29/True-streaming-for-libvips.html eg.
Obviously you could send to a browser instead. It would need some (unknown?) amount of work to integrate this in sharp. I think Lovell is planning to look at it when 8.9 finally ships. |
Yes, after there's a version of sharp that depends on the future libvips v8.9.0 we can look at the best way of handling its new Stream API. A possible first implementation could involve accepting file descriptors as input/output, although that is likely to have to be via new API and not the standard, idiomatic Node.js Stream API (which will still be supported using the existing implementation). A more complete solution might be to create custom libvips Stream objects and use these for passing data chunks to/from Node.js, but there are multi-threading considerations e.g. libvips will be running on a different thread to V8. I'm planning to migrate the internals of sharp to use Node.js's newish N-API first - see #1282 - sadly I don't see anything in N-API for Node.js Stream objects, plus this migration will remove access to things like Node.js' |
Huh that sounds a little frustrating. Do let me know if there's anything I can do to help. |
Thanks for the explanation - I'm not up enough on the details of the APIs to make intelligent comments on the choices made, but I think keeping a mind on the goal is important - i.e. to be able to interface with external (non-sharp) code. In my case .... I had a Node stream with the image rather than a file, (being pulled from the net or a cache, or a P2P process) and needed to pass it back to the browser request. There are probably other requirements for other people but I'm guessing these node streams are the common theme to most of them rather than file descriptors. Also ... can I suggest a minor change to the documentation to explain how to work with streams, it only took me a few minutes to find that sharp claimed to support streams, it took me a couple of hours - searching through git issues etc - to figure out how to plug it into streams, and I almost gave up when this thread implied (to me) that it wasn't supported (rather than what it actually says, which is that its supported but with bad memory handling). I'd suggest an edit to https://sharp.pixelplumbing.com/en/stable/api-constructor/ maybe one of the examples, best I could figure out was the following code.
This works (subject to the memory issues) but wasn't intuitive to figure this out from the docs and I'm not at all sure its the right way to do it. The instinct was to look at "toFile", it took a while to realize that wasnt the right way. Oh ... and thanks, once I figured this out, the library's saved me a bunch of work ! |
@mitra42 Glad you got it working; is this for use with the Internet Archive? The second example in https://sharp.pixelplumbing.com/en/stable/api-constructor/#examples which is generated from https://github.com/lovell/sharp/blob/master/lib/constructor.js#L53 demonstrates Stream-based I/O and I'm always happy to accept PRs with more examples and improvements to the JSDocs if you're able. |
@lovell Oops - sorry I missed that example - not sure how. Yes -this is for the Internet Archive - for the Offline Archive, a version of the archive running in devices as small as Raspberry Pi's in particular to support interaction between that box and a local copy of mediawiki running an extension built by David Kamholz for transcribing palm leaf books, it was using IIIF on the archive to zoom in on the full scan, and I needed to port that support to Javascript. I tried using IIIF via @jonallo 's iiif-image-server-node but the installer for that is broken (it fails to install sharp) and once I'd got sharp installed separately I realized sharp could probably be used better on its own than going through iiif-image-server-node > iiif-image It took all of 25 lines of code ! So pretty happy with that. https://github.com/internetarchive/dweb-mirror/blob/master/mirrorHttp.js#L636 |
@lovell is this still on the roadmap given |
Do I understand correctly? The underlying cli command |
As libvips/libvips#1443 has been merged is there any progress to make streams fully work in sharp as well? 😇 I'm currently already using sharp for profile image upload though they have a much lower max size than the images we want to allow in content so real streaming straight from an xhr upload straight from the browser and then piping to R2 would best. |
This improvement will greatly enhance the end user experience. In our current process, we download the source image and pipe it to Sharp for resizing. Then, we pipe the resized image to an S3 bucket and simultaneously as a chunked http response. However, there is currently a delay because Sharp only emits one large chunk. By implementing proper stream emission, the final image will be delivered to end users more quickly. |
I'm currently working with large images, and true streaming would be ideal. I’ll start working on this unless someone else is already on it. |
I got streaming working, unfortunately I run into an issue that will require dropping it('concurrency', async () => {
const concurrency = 10; // any number equal or greater to UV_THREADPOOL_SIZE (4)
const promises = Array(concurrency).fill().map(async() => {
const file = fixtures.inputJpg320x240;
const input = fs.createReadStream(file);
const stream = sharp().rotate();
input.pipe(stream);
return buffer(stream);
});
const result = await Promise.all(promises);
// ...
}); Given that I rarely code in C++ and need to become more familiar with N-API, it might take me a bit. |
It would be very cool if
sharp
could support proper streaming. As it is now, it will always emit exactly onedata
event which is the whole image. It kind of defeats the purposes of streams.As I see it there is two ways to implement this.
The hard, but robust way.
Every time that
libvips
have data that should be written, it should call a function that we have defined. We then use this data to generate adata
-event on the stream. It would be really cool iflibvips
could provide a state-machine that has a synchronous function that gives the next chunk. That waypause
andunpause
would work perfectly.The quick and easy fix.
We can also create a named pipe and tell
libvips
to write it to that file viavips_image_write_to_file
. To the user we would then return aReadStream
as returned byfs.createReadStream(path)
.The text was updated successfully, but these errors were encountered: