Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster BMP and JPEG decoding with any backend #1878

Closed
wants to merge 16 commits into from

Conversation

Shnatsel
Copy link
Contributor

@Shnatsel Shnatsel commented Mar 11, 2023

Avoid a memset() + memcpy() of the entire decompressed image, saving 10%-15% of the time in decoding JPEGs and 25%-30% in decoding BMP, measured on

ImageReader::open(input)?.with_guessed_format()?.decode()?;

I license past and future contributions under the dual MIT/Apache-2.0 license,
allowing licensees to choose either at their option.

@Shnatsel
Copy link
Contributor Author

Curiously this does not change the peak memory usage reported by time, and I don't understand why not. I'd expect it to be lower now that we don't have to keep two copies of the image in memory at the peak. Replacing with_capacity() with new() didn't change it either.

@Shnatsel

This comment was marked as outdated.

@Shnatsel
Copy link
Contributor Author

This exposed a latent bug in TGA decoder - streaming decoding never worked correctly, but now it's causing a panic on the tests. I don't know enough about TGA to debug this.

@Shnatsel Shnatsel changed the title Faster JPEG decoding with any backend Faster BMP and JPEG decoding with any backend Mar 11, 2023
@fintelia
Copy link
Contributor

fintelia commented Mar 12, 2023

Thanks for making this PR! The performance observations are super valuable for optimizing.

I'm a bit torn about whether to merge this as is. Special casing into_reader().read_to_end() (when called with an empty Vec) as being the most performant way to use a decoder feels like an abuse of all the APIs involved.

A skim of the BMP decoder suggests that it might not be too much work to avoid the copy_from_slice call and decode directly into the output buffer. This might not be ideal for JPEG though since jpeg-decoder and zune-jpeg don't seem to have APIs to decode into a pre-allocated buffer.

@Shnatsel
Copy link
Contributor Author

That's fair, it did strike me as a bit of an abuse of the API.

How about instead of having a single decoder_to_vec function that calls reads, we add decode_to_vec to the Decoder trait, with the version from before this PR as the default implementation? That should let us specialize the JPEG implementation without creating weird quirks that affect all decoders.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Mar 12, 2023

Closing this in favor of #1879

@Shnatsel Shnatsel closed this Mar 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants