Investigate using ARM NEON instructions to speed up image processing #7
https://github.com/libjpeg-turbo/libjpeg-turbo uses NEON on ARM, so that part seems to be feasible. I’ll need to profile the G3 compression to see how to make it faster.
Commit 2f7d475 brings the rotation + G3 compression down to 5s on the Raspberry Pi 3. The longest step is now definitely the JPEG encoding.
This conversion used to take about 3s (for 4960x7016 RGB pixels). With the new code it takes about 300ms.

More potential for improvement: we could run this code while reading pixels via USB. I haven’t looked at the USB timing in detail, but my guess is that we could squeeze this post-processing into the time between requesting data from the device and receiving the data from the kernel. If that doesn’t work out, we could parallelize and post-process the previous buffer while reading the current buffer.

Note that we need to use the WORD instruction because the Go assembler lacks support for the NEON instructions, see golang/go#7300.

Related to issue #7.
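The idea of post-processing the previous buffer while the current one is being read can be sketched with two buffers and a worker goroutine. This is only a minimal sketch, not the code from the commit: `pipeline`, `read`, `process`, and `bufSize` are hypothetical names, and `read` stands in for whatever call fetches pixel data from the scanner over USB.

```go
package main

import "sync"

// pipeline overlaps reading and post-processing using two buffers.
// read (hypothetical) fills buf and returns the number of bytes read,
// or 0 when no more data is available. process (hypothetical) handles
// one filled chunk, e.g. a color conversion step.
func pipeline(read func(buf []byte) int, process func(chunk []byte), bufSize int) {
	free := make(chan []byte, 2)   // buffers ready to be filled
	filled := make(chan []byte, 1) // buffers waiting for post-processing
	free <- make([]byte, bufSize)
	free <- make([]byte, bufSize)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for chunk := range filled {
			process(chunk)
			// Hand the buffer back at full size for the next read.
			free <- chunk[:cap(chunk)]
		}
	}()

	for {
		buf := <-free // blocks only if both buffers are still in use
		n := read(buf)
		if n == 0 {
			break
		}
		filled <- buf[:n]
	}
	close(filled)
	wg.Wait()
}
```

With only two buffers the reader can be at most one chunk ahead of the worker, so memory use stays bounded. Whether this actually hides the post-processing time depends on the USB timing, which has not been measured here.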
I have a proof of concept which uses a port of libjpeg-turbo’s NEON assembler functions. It completes JPEG encoding within 2.5s of wall-clock time. I’ll look into whether we can run the processing in parallel with reading data from the scanner. That way, we might be able to pull off scanning in almost real time, i.e. with minimal wait after each page :).
Next steps for cleaning up the optimized JPEG encoder (
This is a fork of Go 1.8’s image/jpeg, changed to use data structures which are compatible with libjpeg-turbo, so that we can use the NEON assembler routines.

Related to issue #7.
This is the current processing time for each piece of paper after scanning finished:
I looked into encoding the G3-encoded image data into a TIFF file (which works), but it turns out that neither Chrome nor Firefox supports TIFF. All JavaScript-based TIFF/PDF viewers and extensions are super slow.
Created issue #15 for speeding up the thumbnail creation. Closing this issue as we’re now using NEON code for JPEG compression.
http://hilbert-space.de/?p=22 contains examples for how one could approach converting from color to grayscale.
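For reference, a plain Go version of such a color-to-grayscale conversion might look like the sketch below. It is a scalar baseline, not NEON code and not this project's implementation. The function name `rgbToGray` is hypothetical, and the fixed-point weights 77, 151 and 28 (which sum to 256) are an assumed, commonly used approximation of the usual luma weights.

```go
package main

// rgbToGray converts interleaved 8-bit RGB pixels in src to 8-bit
// grayscale values in dst. It processes as many whole pixels as fit
// into both slices.
//
// The weights 77/151/28 are an assumption (a common fixed-point
// approximation); they sum to 256 so the result fits into one byte
// after shifting right by 8.
func rgbToGray(dst, src []byte) {
	n := len(src) / 3
	if len(dst) < n {
		n = len(dst)
	}
	for i := 0; i < n; i++ {
		r := uint32(src[3*i])
		g := uint32(src[3*i+1])
		b := uint32(src[3*i+2])
		dst[i] = byte((77*r + 151*g + 28*b) >> 8)
	}
}
```

A NEON version would perform the same multiply-accumulate and shift on several pixels per instruction, which is where the speed-up would come from.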
We currently perform the following operations:
Binarization and rotation should be easy to implement, but also provide the smallest wins. Making JPEG encoding faster seems like the biggest win, but I’m not sure if that’s possible.
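To illustrate why binarization is one of the easier steps, here is a minimal sketch that thresholds a grayscale image and packs the result into 1 bit per pixel. The function name `binarize`, the fixed threshold, and the bit layout (most significant bit first, 1 meaning black) are assumptions for this example, not the layout the project necessarily uses.

```go
package main

// binarize thresholds an 8-bit grayscale image (width*height bytes,
// row-major) and returns a packed bilevel image with 1 bit per pixel.
//
// Assumed layout: each row is padded to a whole number of bytes, bits
// are stored most significant bit first, and a set bit means "black"
// (gray value below threshold).
func binarize(gray []byte, width, height int, threshold byte) []byte {
	stride := (width + 7) / 8 // bytes per packed row
	out := make([]byte, stride*height)
	for y := 0; y < height; y++ {
		for x := 0; x < width; x++ {
			if gray[y*width+x] < threshold {
				out[y*stride+x/8] |= byte(0x80) >> uint(x%8)
			}
		}
	}
	return out
}
```

The inner loop is a simple compare-and-pack over independent pixels, so it is easy to write but, as noted above, it is also unlikely to be where most of the time goes.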