Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve JPEG encode_block perf by about 25% with faster integer encoding #1912

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

mcroomp
Copy link

@mcroomp mcroomp commented Apr 20, 2023

I license past and future contributions under the dual MIT/Apache-2.0 license,
allowing licensees to choose either at their option.

Also add benches for tracking performance of JPEG block encoding.

pre:
test codecs::jpeg::encoder::tests::bench_encode_block ... bench: 548 ns/iter (+/- 131) = 135 MB/s
post:
test codecs::jpeg::encoder::tests::bench_encode_block ... bench: 410 ns/iter (+/- 140) = 180 MB/s

@mcroomp
Copy link
Author

mcroomp commented Apr 20, 2023

Looks like NonZeroU32::unsigned_abs was stabilized in 1.64... let's see if there's a workaround

@HeroicKatora
Copy link
Member

We can also bump MSRV for a perf win, it's not a set-in-stone restriction. Bump here and as indicated in the comment to do it with the PR:

https://github.com/image-rs/image/blob/master/Cargo.toml#L7-L8

@Shnatsel
Copy link
Contributor

I see MSRV has been bumped to 1.67 since, which is higher than this PR's requirement of 1.64.

I understand there is nothing blocking this, and it can be merged?

The change is simple, covered by tests, and the performance gains are tantalizing.

@fintelia
Copy link
Contributor

Will look into reviving this PR once 0.25.0 is out

@mcroomp
Copy link
Author

mcroomp commented Mar 2, 2024

Rebased since this change had sat there for a while

@mcroomp
Copy link
Author

mcroomp commented Mar 2, 2024

There's still a CI test_toolchain with 1.63 that needs to be upgraded before this can be integrated

Copy link
Contributor

@fintelia fintelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR!

I think it is kind of doing a lot and would be better split into a bunch of PRs. Perhaps:

  1. Adding benchmarks to give a baseline performance number and justify subsequent PRs
  2. Optimizations to BitWriter
  3. New version of write_block
  4. New version of encode_coefficient

@@ -765,7 +855,8 @@ fn build_quantization_segment(m: &mut Vec<u8>, precision: u8, identifier: u8, qt
}
}

fn encode_coefficient(coefficient: i32) -> (u8, u16) {
#[cfg(feature = "benchmarks")]
fn encode_coefficient_old(coefficient: i32) -> (u8, u16) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should keep around an old implementation simply to benchmark against

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants