Split data in rectangle chunks for parallelizable processing #33
Possibly related to #29 |
I'm going to try to implement this today for testing purposes |
Here's my attempt: github.com/rmn20/qoi In order to implement parallel processing, we need to clear the array of 64 previously seen pixels, as well as the info about the previous pixel, at the start of every chunk. I have tried several chunk sizes and here are my results:
(theoretically 8*8 should be the best because the index holds 8^2 = 64 pixels, but 16x performs best) I also tried encoding in chunks without clearing the state. This would not allow parallel processing, but THEORETICALLY it should improve the compression ratio.
Honestly, pixel locality doesn't seem to improve compression much... Perhaps an improved QOI_INDEX could help here. (maybe we can store 3 indices in one 16-bit tag?) |
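The per-chunk reset described above could be sketched like this in C. This is only a sketch, not code from either repository; the struct and function names are hypothetical, but the reset values match the QOI spec's initial encoder state (zeroed index, previous pixel = opaque black):

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t r, g, b, a; } qoi_rgba_t;

/* Hypothetical per-chunk encoder state: QOI's running index of 64
 * previously seen pixels plus the previously encoded pixel. */
typedef struct {
    qoi_rgba_t index[64];   /* running index, as in the QOI spec */
    qoi_rgba_t prev;        /* previously encoded pixel */
} chunk_state_t;

/* Reset the state at the start of each chunk so the chunk can be
 * encoded/decoded without any data from neighbouring chunks. */
static void chunk_state_reset(chunk_state_t *s) {
    memset(s->index, 0, sizeof(s->index));
    s->prev = (qoi_rgba_t){0, 0, 0, 255};  /* QOI's initial pixel value */
}
```

Calling `chunk_state_reset` at every chunk boundary is exactly what makes the chunks independent (and therefore parallelizable), at the cost of losing cross-chunk index hits.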
Is it due to the hardware not being the same? |
I don't think the hardware can affect this. |
I think this is related to 30f8a39 |
I tried to change the order of pixels in chunks and combine chunks encoding with #38, and I think I got some interesting results. |
So I changed pixel order in chunks from this
to this
And here are my results:
I'm wondering at the moment how I can show this combined code on GitHub. I mean, I can't create a fork of a fork :P |
feel free to adopt my changes into your code, your results are very interesting (and I'm a bit sad that the 7% on kodim drops to half on other datasets) - maybe you could even try some fancier ordering, like JPEG zigzag, but I think it won't work well as this is not DCT-based |
The obvious things to try would be z-curves or Hilbert curves. |
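For the z-curve (Morton order) suggestion, the traversal index can be computed by interleaving the bits of the x and y coordinates, so pixels that are close in 2D stay close in the encoded stream. A minimal sketch (function names are my own; assumes chunk sides are powers of two no larger than 2^16):

```c
#include <stdint.h>

/* Spread the low 16 bits of v so they occupy the even bit positions:
 * bit i of v moves to bit 2*i of the result. */
static uint32_t part1by1(uint32_t v) {
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

/* Z-curve (Morton) index: x bits on even positions, y bits on odd. */
static uint32_t morton_index(uint32_t x, uint32_t y) {
    return part1by1(x) | (part1by1(y) << 1);
}
```

Visiting the pixels of a chunk in increasing `morton_index` order gives the recursive-quadrant traversal; a Hilbert curve improves locality further but needs a more involved (rotation-aware) computation.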
Updated the code on my fork |
amazing, this adds 7% more on top of what I had (kodim) - also, when deflated, it now seems to beat PNG. pretty neat! |
qoi-style encoding still has problems with images like this one: my scheme somewhat improves cases like that (alpha mode), but it's still far from optimal compared to PNG. I wonder if completely splitting alpha encoding (somehow) might help here, like qoi-style encoding of multiple pixels at once or something like that. By the way, this image breaks when encoded with your chunk compressor - probably the mode switch command isn't handled properly or something like that? I see now, at line 716 in the decoder, on mode change, there is
|
Yes, it looks like I missed that |
The file format for QOI will not change anymore. See #37 for more info. Ideas for a successor to QOI should be discussed here: https://github.com/nigeltao/qoi2-bikeshed/issues |
The idea is to add two more fields to the header: chunk width and chunk height (u8?) that define blocks of data that can be processed independently of the others (each has its own "64 last seen pixels" index). The number of chunks can easily be determined by dividing the image's dimensions by the chunk dimensions.
There are several advantages to this:
The drawback is the added complexity, but I think the sacrifice would be worth it. I'm waiting for your feedback before starting to implement and benchmark this.
Also, each chunk could be stored contiguously for better data (rather than pixel) locality. Maybe better, maybe too complex for this, dunno...
What do you think?
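The chunk-count computation mentioned in the proposal could look like this (a sketch with hypothetical names; ceiling division handles images whose dimensions aren't multiples of the chunk size):

```c
#include <stdint.h>

/* Number of independently decodable chunks, given the image dimensions
 * and the proposed chunk-width/chunk-height header fields. */
static uint32_t chunk_count(uint32_t image_w, uint32_t image_h,
                            uint8_t chunk_w, uint8_t chunk_h) {
    uint32_t cols = (image_w + chunk_w - 1) / chunk_w;  /* ceil(w / cw) */
    uint32_t rows = (image_h + chunk_h - 1) / chunk_h;  /* ceil(h / ch) */
    return cols * rows;
}
```

A decoder could hand each of these chunks to a separate thread, since each one starts from a freshly reset index.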