Feature Request: Multithreading (rayon) #3
Comments
It's currently not easy to implement multithreading efficiently due to limitations in

On the topic of optimization: It's possible to perform more exotic optimizations like using SIMD to process 4 or 8 rows/columns at once per thread. I was also told that SIMD can make it possible to "unroll", or process more than one item per iteration (and that my implementation is "poorly written" for not doing so), but that's far outside my skillset. There is an experimental

I do really like the generator API though (it's much more flexible than wrapping an iterator) and I'm sad that it can't reach an acceptable level of performance. Maybe I'll have to wait until Rust supports true coroutines.

It's going to rot, because I'm not going to merge it in (it's too slow). For the non-state-machine branch, it looks like LLVM is somehow optimizing the horizontal blur in a special way (it likely does this for your vertical blur as well, but not mine, which is how yours manages to be faster). Without true coroutines I can't seem to get the non-generator version to optimize correctly. Someone is probably gonna have to look into this and see if it's possible.

I have also tried multiple times (unsuccessfully) to reduce the
I now have the iterators I need for this (

jk lol 7996893 try that

Wow, very impressive!

Nice! :) Glad to hear that it's starting to perform well. I'll open a PR for the

@owenthewizard The new implementation of the
It's great to see a correct stackblur implementation in Rust! I swapped the algorithm I use in i3lockr for yours and it takes about 100 ms, whereas the multithreaded C version of stackblur I was using takes about 30 ms. However, the C version is naive and clobbers the gamma. The 100 ms includes the sRGB-to-linear round trip. I expect that with multithreading yours may be very close to 30 ms, if not faster.
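To make the request concrete, here is a minimal sketch of splitting the horizontal pass across threads by handing each thread a band of rows. It uses `std::thread::scope` rather than rayon so that it compiles without extra dependencies; with rayon, the same band split could presumably be expressed with its parallel slice chunking instead. Everything here is an assumption for illustration: `blur_row` is a hypothetical stand-in (a 3-tap box average) for the crate's real row pass, and `blur_rows_parallel`, the `u32` pixel type, and the row-major layout are not taken from the crate.

```rust
use std::thread;

/// Hypothetical stand-in for a real horizontal blur pass:
/// a 3-tap box average over one row, clamping at the edges.
fn blur_row(row: &mut [u32]) {
    let src = row.to_vec();
    let last = src.len().saturating_sub(1);
    for (i, out) in row.iter_mut().enumerate() {
        let left = src[i.saturating_sub(1)] as u64;
        let mid = src[i] as u64;
        let right = src[(i + 1).min(last)] as u64;
        // Sum in u64 so three large u32 values cannot overflow.
        *out = ((left + mid + right) / 3) as u32;
    }
}

/// Blur every row of a row-major, `width`-wide buffer, splitting the rows
/// into at most `threads` contiguous bands, one scoped thread per band.
/// (Hypothetical helper, not part of any existing API.)
fn blur_rows_parallel(buf: &mut [u32], width: usize, threads: usize) {
    assert!(width > 0 && buf.len() % width == 0);
    let height = buf.len() / width;
    if height == 0 {
        return;
    }
    let threads = threads.max(1);
    // Rows per band, rounded up so that every row is covered.
    let rows_per_band = (height + threads - 1) / threads;

    thread::scope(|s| {
        // `chunks_mut` yields disjoint mutable slices, so each thread
        // owns its band exclusively and no locking is needed.
        for band in buf.chunks_mut(rows_per_band * width) {
            s.spawn(move || {
                for row in band.chunks_mut(width) {
                    blur_row(row);
                }
            });
        }
    });
}
```

Calling `blur_rows_parallel(&mut pixels, width, 4)` would blur all rows using up to four threads. Rows are independent in the horizontal pass, so this split is straightforward; the vertical pass works on columns, which are not contiguous in a row-major buffer, so it would need a different split (or a transpose) and is not covered by this sketch.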