
Image resizing feature #149

Closed
kvark opened this issue Nov 18, 2023 · 4 comments
Comments

kvark commented Nov 18, 2023

Currently unimplemented in bilinear_impl

etemesi254 (Owner) commented:

Hi, I'd recommend using fast_image_resize, as I work on that. But even my implementation won't do what you want: it is only meant for single-channel (grayscale) images, since zune-image de-interleaves images into separate color channels. You seem to want to use it on interleaved pixels, and that won't work.

Shnatsel (Contributor) commented Jan 2, 2024

FYI the image crate recently landed an optimized bilinear scaling algorithm that works on de-interleaved pixels internally: image-rs/image#2078

It relies on autovectorization to achieve SIMD acceleration. On large images it is probably not as fast as anything using AVX2 explicitly, though.

etemesi254 (Owner) commented:

> FYI the image crate recently landed an optimized bilinear scaling algorithm that works on de-interleaved pixels internally: image-rs/image#2078

Interesting function, but it only tells you what pixel is supposed to be at position (x, y), not what is actually there, so it ends up doing a lot of redundant work when called inside an image resize. The better way to do it is a plain for loop that hoists the calculations that don't change per iteration: when resizing, you wouldn't repeat the bounds checks that return None in the inner loop, and you can lift the y-dependent work out of it entirely.

A more performant bilinear resize would look like this: https://godbolt.org/z/rYxf6qTh1. But again, it only works on one channel. If you are on four channels and your input/output is floats, it's much better, since you can actually vectorize it.
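To make the hoisting idea above concrete, here is a minimal sketch of a single-channel bilinear resize in the shape described: the row pair, vertical weight, and row slices are computed once per output row instead of once per pixel. This is an illustrative example, not the code behind the godbolt link or the zune-imageprocs API; all names are hypothetical.

```rust
// Hypothetical single-channel bilinear resize sketch.
// src is sw * sh pixels, row-major; returns dw * dh pixels.
fn bilinear_resize(src: &[f32], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<f32> {
    let mut dst = vec![0.0f32; dw * dh];
    // "Align corners" style mapping; .max(1) guards the 1-pixel edge cases.
    let x_ratio = (sw.max(1) - 1) as f32 / ((dw.max(1) - 1).max(1)) as f32;
    let y_ratio = (sh.max(1) - 1) as f32 / ((dh.max(1) - 1).max(1)) as f32;
    for dy in 0..dh {
        // Hoisted: the source row pair and vertical weight are constant
        // across the whole output row, so none of this repeats per pixel.
        let fy = dy as f32 * y_ratio;
        let y0 = fy as usize;
        let y1 = (y0 + 1).min(sh - 1);
        let wy = fy - y0 as f32;
        let row0 = &src[y0 * sw..y0 * sw + sw];
        let row1 = &src[y1 * sw..y1 * sw + sw];
        for dx in 0..dw {
            let fx = dx as f32 * x_ratio;
            let x0 = fx as usize;
            let x1 = (x0 + 1).min(sw - 1);
            let wx = fx - x0 as f32;
            // Lerp horizontally in each row, then vertically between rows.
            let top = row0[x0] + (row0[x1] - row0[x0]) * wx;
            let bot = row1[x0] + (row1[x1] - row1[x0]) * wx;
            dst[dy * dw + dx] = top + (bot - top) * wy;
        }
    }
    dst
}
```

Because x0, x1, and wx depend only on dx, a further step would be precomputing them into a lookup table once for all rows.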

Which leads me to:

> It relies on autovectorization to achieve SIMD acceleration. Probably not as fast on large images as anything using AVX2 explicitly though.

Sadly, there is no autovectorization happening here. Floating-point calculations on x86 use vector registers by default, with the instruction suffix telling us whether it is scalar (one calculation at a time) or packed (multiple calculations at once): ss means scalar single precision, ps means packed single precision; see mulss vs mulps.

So if you look closely below, we have mulss, which means multiply scalar single precision, i.e. multiply one floating-point number by another:

[screenshot: disassembly showing repeated mulss instructions]

And if you count how many times it is repeated, the loop is unrolled 4x, each copy handling a single iteration. The one reason I can think of for the compiler not autovectorizing this is floating-point non-associativity; see What Every Computer Scientist Should Know About Floating-Point Arithmetic.
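The non-associativity point is easy to demonstrate: because vectorizing a reduction reorders the additions, and reordering can change the result, the compiler must keep the scalar order unless told otherwise. A small example of the effect (illustrative, not from the thread's godbolt links):

```rust
// Floating-point addition is not associative: (a + b) + c can differ
// from a + (b + c). This is why compilers won't reorder (and therefore
// won't vectorize) float reductions without fast-math style permission.
fn fp_assoc_demo() -> (f32, f32) {
    let a = 1e20f32;
    let b = -1e20f32;
    let c = 1.0f32;
    let left = (a + b) + c;  // (1e20 + -1e20) + 1.0 = 0.0 + 1.0 = 1.0
    let right = a + (b + c); // -1e20 + 1.0 rounds back to -1e20, so the sum is 0.0
    (left, right)
}
```

Same three inputs, two different answers depending on grouping, so any transformation that regroups the sum changes observable behavior.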

I couldn't nudge the compiler into producing vectorized code no matter how hard I tried, so I had to use SIMD explicitly. Here is an actual bilinear resize for f32 floats that uses SIMD (via portable SIMD): https://godbolt.org/z/vzPEzGTs8

[screenshot: disassembly of the portable-SIMD version showing packed (ps suffix) instructions]

Notice how we now have ps suffixes and a shorter output? That means we finally got what we expected when we said SIMD.
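For readers without nightly Rust's std::simd, the scalar-vs-packed distinction can be sketched on stable with fixed-size arrays: a lane-wise lerp over four values at once is exactly the shape that lets the optimizer emit packed mulps/addps instead of four scalar mulss/addss. This is an illustrative sketch, not the code from the godbolt link; with portable SIMD the arrays would be f32x4 values.

```rust
// Lane-wise linear interpolation of four pixel pairs at once:
// out[i] = a[i] + (b[i] - a[i]) * w[i] for each of the 4 lanes.
// The fixed-width, branch-free loop body is trivially vectorizable.
fn lerp4(a: [f32; 4], b: [f32; 4], w: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        out[i] = a[i] + (b[i] - a[i]) * w[i];
    }
    out
}
```

Unlike a reduction, this has no cross-lane dependency, so the compiler is free to do all four multiplies in one packed instruction.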

etemesi254 (Owner) commented:

This is now present in zune-imageprocs as Resize, so I think this can be closed.
