Image resizing feature #149
Comments
Hi, I'd recommend using
FYI the It relies on autovectorization to achieve SIMD acceleration. It is probably not as fast on large images as anything that uses AVX2 explicitly, though.
Interesting function, but it only tells you which pixel is supposed to be at position x,y, not what is actually at that position, so it ends up doing a lot of redundant work when called inside an image resize. The best way to do it would be a for loop that hoists the calculations that don't change per iteration, e.g. when resizing you wouldn't repeat the checks that return
A more performant bilinear resize would look like this: https://godbolt.org/z/rYxf6qTh1. Again, it only works on one channel. If you have four channels and your input/output is floats, it's much better, since you can actually vectorize it. Which leads me to
Sadly, there is no autovectorization happening. Floating-point calculations on x86 use vector registers by default, and the instruction suffix tells us whether the operation is scalar (a single calculation at once) or packed (multiple calculations at once). So if you look closely below, we have and if you count how many times it is repeated, it's unrolled 4 times, each one for a single iteration.
The one reason I can think of for the compiler not autovectorizing this is that floating-point arithmetic is not associative; see What Every Computer Scientist Should Know About Floating-Point Arithmetic.
I couldn't nudge the compiler into producing vectorized code no matter how hard I tried, so I had to use SIMD explicitly. Here is an actual bilinear resize for f32 floats that uses SIMD: https://godbolt.org/z/vzPEzGTs8 (using portable SIMD). Notice how we have now
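For readers following along without opening the godbolt links, here is a minimal scalar sketch of the "hoist what doesn't change per iteration" idea discussed above. It is not the code behind either link: the function name `resize_bilinear_f32`, the row-major single-channel f32 layout, and the corner-aligned coordinate mapping are all assumptions made for illustration.

```rust
/// Bilinear resize of a single-channel, row-major f32 image.
/// Hypothetical helper for illustration only; not the linked godbolt code.
fn resize_bilinear_f32(src: &[f32], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<f32> {
    assert!(sw > 0 && sh > 0 && dw > 0 && dh > 0);
    assert_eq!(src.len(), sw * sh);

    // Assumption: corner-aligned mapping, so the first and last destination
    // pixels land exactly on the first and last source pixels.
    let x_scale = if dw > 1 { (sw - 1) as f32 / (dw - 1) as f32 } else { 0.0 };
    let y_scale = if dh > 1 { (sh - 1) as f32 / (dh - 1) as f32 } else { 0.0 };

    // Hoisted: horizontal indices and weights depend only on x, so they are
    // computed once per column instead of once per output pixel.
    let cols: Vec<(usize, usize, f32)> = (0..dw)
        .map(|x| {
            let fx = x as f32 * x_scale;
            let x0 = (fx as usize).min(sw - 1);
            let x1 = (x0 + 1).min(sw - 1);
            (x0, x1, fx - x0 as f32)
        })
        .collect();

    let mut dst = Vec::with_capacity(dw * dh);
    for y in 0..dh {
        // Hoisted: vertical indices, weight, and row slices depend only on y.
        let fy = y as f32 * y_scale;
        let y0 = (fy as usize).min(sh - 1);
        let y1 = (y0 + 1).min(sh - 1);
        let wy = fy - y0 as f32;
        let top = &src[y0 * sw..(y0 + 1) * sw];
        let bot = &src[y1 * sw..(y1 + 1) * sw];

        // The inner loop is left with two horizontal lerps and one vertical lerp.
        for &(x0, x1, wx) in &cols {
            let t = top[x0] + (top[x1] - top[x0]) * wx;
            let b = bot[x0] + (bot[x1] - bot[x0]) * wx;
            dst.push(t + (b - t) * wy);
        }
    }
    dst
}
```

Calling `resize_bilinear_f32(&[0.0, 1.0, 2.0, 3.0], 2, 2, 3, 3)` upsamples a 2x2 image to 3x3. The inner loop still does indexed loads through `x0`/`x1`, which is one reason a compiler may keep it scalar; an explicit SIMD version would likely process several columns or channels per iteration instead.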
Now present in
Currently unimplemented in
bilinear_impl