sRGB conversion lookup tables #126

Closed
Ogeon opened this issue Mar 3, 2019 · 5 comments · Fixed by #279

@Ogeon
Owner

Ogeon commented Mar 3, 2019

Using lookup tables for at least converting u8 sRGB to and from linear RGB would skip a bunch of calculations, at the cost of space for the table(s). u8 -> f32 is just a simple lookup, while f32 -> u8 is trickier, but should be possible. Maybe through interpolation, or by going f32 -> linear u8 -> sRGB u8 if that turns out to have a reasonable precision loss. It needs some research.
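To make the two directions concrete, here is a minimal sketch under stated assumptions. It uses the standard sRGB transfer function and plain arrays, and every function name in it is hypothetical, not part of palette. It builds the 256-entry u8 -> f32 table, and it also builds the table for the second half of the f32 -> linear u8 -> sRGB u8 route so the precision loss can be inspected.

```rust
/// Decode an sRGB-encoded value in [0, 1] to linear light.
fn srgb_to_linear(encoded: f32) -> f32 {
    if encoded <= 0.04045 {
        encoded / 12.92
    } else {
        ((encoded + 0.055) / 1.055).powf(2.4)
    }
}

/// Encode a linear value in [0, 1] as sRGB.
fn linear_to_srgb(linear: f32) -> f32 {
    if linear <= 0.0031308 {
        linear * 12.92
    } else {
        1.055 * linear.powf(1.0 / 2.4) - 0.055
    }
}

/// u8 sRGB -> f32 linear: one table entry per possible input code.
fn build_decode_table() -> [f32; 256] {
    let mut table = [0.0f32; 256];
    for (code, slot) in table.iter_mut().enumerate() {
        *slot = srgb_to_linear(code as f32 / 255.0);
    }
    table
}

/// Linear u8 -> sRGB u8: the table half of the two-step encode route.
fn build_linear_u8_to_srgb_u8_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    for (code, slot) in table.iter_mut().enumerate() {
        *slot = (linear_to_srgb(code as f32 / 255.0) * 255.0).round() as u8;
    }
    table
}

/// f32 linear -> linear u8 -> sRGB u8. Quantizing in linear space first
/// throws away resolution in the dark end of the range.
fn encode_via_linear_u8(linear: f32, table: &[u8; 256]) -> u8 {
    let index = (linear.clamp(0.0, 1.0) * 255.0).round() as usize;
    table[index]
}

/// Reference encoder without a table, for comparison.
fn encode_direct(linear: f32) -> u8 {
    (linear_to_srgb(linear.clamp(0.0, 1.0)) * 255.0).round() as u8
}
```

In this sketch the direct encoder round-trips every u8 code, while the two-step route maps dark values such as sRGB code 5 to 0, because the smallest non-zero linear u8 already corresponds to an sRGB code above 10. That suggests an 8-bit linear intermediate is too coarse on its own.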

@okaneco
Contributor

okaneco commented Apr 12, 2020

I think this could be a good idea for Lab too. I was working on converting high-resolution images to Lab space and eventually back to Srgb. There's a very noticeable drop in speed when converting from Srgb to Lab as image size increases. I went down a rabbit hole of profiling/benchmarking and I'm not sure how much can be done.

These are the functions I used for benching. It gets a raw buffer from the image crate, converts to lab (the other to linsrgb), and then converts that back into the Rgb format for output.

// `Pixel` provides `from_raw_slice` and `into_raw_slice`.
use palette::{Lab, LinSrgb, Pixel, Srgb};

// Round trip: sRGB u8 -> Lab -> sRGB u8.
pub fn lab_f32(img: &[u8]) -> Vec<u8> {
    let lab: Vec<Lab> = Srgb::from_raw_slice(img)
        .iter()
        .map(|x| x.into_format().into())
        .collect();
    let srgb: Vec<Srgb<u8>> = lab.iter().map(|x| Srgb::from(*x).into_format()).collect();

    Srgb::into_raw_slice(&srgb).to_vec()
}

// Round trip: sRGB u8 -> linear sRGB -> sRGB u8.
pub fn linsrgb_f32(img: &[u8]) -> Vec<u8> {
    let linsrgb: Vec<LinSrgb> = Srgb::from_raw_slice(img)
        .iter()
        .map(|x| x.into_format().into_linear())
        .collect();
    let srgb: Vec<Srgb<u8>> = linsrgb
        .iter()
        .map(|x| Srgb::from_linear(*x).into_format())
        .collect();

    Srgb::into_raw_slice(&srgb).to_vec()
}

I set up a test project here https://github.com/okaneco/palette_toy_benchmarks

The timings represent the round trip time from Srgb -> color -> Srgb. The timings are brutal for Lab, and LinSrgb takes a little less than half as long as Lab. To collect enough samples, or to run for long enough, you end up waiting a while. The increase in time is linear with the pixel count (for smaller images, anything below a second or two feels practically instant).

1. Image dimensions: 600x338 - 202,800 pixels

| animal-beach-claw-crab-584501-w600.jpg | Time |
| --- | --- |
| lab_f32 | [77.947 ms 78.372 ms 78.806 ms] |
| linsrgb_f32 | [34.997 ms 35.218 ms 35.403 ms] |
| srgb_f32 | [11.843 ms 11.923 ms 12.014 ms] |
| srgb_u8 | [9.0099 ms 9.0387 ms 9.0686 ms] |

2. Image dimensions: 5331x3001 - 15,998,331 pixels

| animal-beach-claw-crab-584501.jpeg | Time |
| --- | --- |
| lab_f32 | [6.2878 s 6.3092 s 6.3310 s] |
| linsrgb_f32 | [2.8557 s 2.8723 s 2.9109 s] |
| srgb_f32 | [1.0217 s 1.0275 s 1.0330 s] |
| srgb_u8 | [729.06 ms 730.50 ms 732.63 ms] |

There's also a main.rs that I ran under cargo flamegraph to get an idea of where time is being spent. It uses the same two functions in separate loops. Unfortunately, I think you have to download the svg file in order to open it in your browser and interact with it. This is on a Mac and some dtrace settings are disabled for security, so I'm not sure how much detail is missing.

Some calls seem unavoidable, like the math library functions cbrt, powf, and round. Next come the matrices, where a lot of time is spent in rgb_to_xyz and the inversion, and then into_format. The last thing I noticed was max_intensity and the float classify call for the matrix inversion determinant. Almost everything I tried changing either made no difference or caused a regression, so the compiler doesn't seem to need help.

It would probably be a good idea to set up a benching harness to track improvements/regressions. I need to try again but the new changes with into/from_color seem to improve performance slightly in the range of 5-10% for the benches I was running. I had changed the path to a local palette checkout at one point in the Cargo.toml.
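As a rough illustration of what a very small harness could look like, here is a sketch that uses only the standard library. The names `time_samples` and `median` are hypothetical, and this is far cruder than a dedicated benchmarking crate (no warm-up, no outlier handling), so treat it as a starting point only.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Run `f` `samples` times and record the wall-clock time of each run.
/// `black_box` keeps the compiler from optimizing the result away.
fn time_samples<F, R>(samples: usize, mut f: F) -> Vec<Duration>
where
    F: FnMut() -> R,
{
    (0..samples)
        .map(|_| {
            let start = Instant::now();
            black_box(f());
            start.elapsed()
        })
        .collect()
}

/// Median of the recorded durations (upper median for even counts),
/// or `None` if there are no samples.
fn median(mut durations: Vec<Duration>) -> Option<Duration> {
    durations.sort();
    durations.get(durations.len() / 2).copied()
}
```

A call such as `median(time_samples(20, || lab_f32(&buffer)))` would then give one number per function to compare before and after a change, assuming `buffer` holds the raw image bytes.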

@Ogeon
Owner Author

Ogeon commented Apr 12, 2020

Setting up a lookup table for converting between spaces will probably be a bit more complicated. The RGB table can use the same 1D table for all three components, while a table for converting between spaces will need to look up values based on all three components together.
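The size difference between the two cases can be sketched as follows. This is only an illustration with hypothetical names. It assumes u8 inputs and, for the full table, three f32 values stored per entry.

```rust
/// Standard sRGB decoding for one component in [0, 1].
fn srgb_to_linear(encoded: f32) -> f32 {
    if encoded <= 0.04045 {
        encoded / 12.92
    } else {
        ((encoded + 0.055) / 1.055).powf(2.4)
    }
}

/// One 256-entry table is enough for RGB, since R, G and B are
/// decoded independently with the same function.
fn build_component_table() -> [f32; 256] {
    let mut table = [0.0f32; 256];
    for (code, slot) in table.iter_mut().enumerate() {
        *slot = srgb_to_linear(code as f32 / 255.0);
    }
    table
}

/// Decode a whole pixel by indexing the shared table three times.
fn decode_pixel(table: &[f32; 256], pixel: [u8; 3]) -> [f32; 3] {
    [
        table[pixel[0] as usize],
        table[pixel[1] as usize],
        table[pixel[2] as usize],
    ]
}

/// A table keyed on all three u8 components needs one entry per color.
const FULL_RGB_TABLE_ENTRIES: usize = 256 * 256 * 256;

/// Size in bytes if each entry stores three f32 output components.
fn full_table_bytes() -> usize {
    FULL_RGB_TABLE_ENTRIES * 3 * std::mem::size_of::<f32>()
}
```

Under those assumptions the shared table is 1 KiB, while the full table has 16,777,216 entries and takes 192 MiB, which is why a conversion between color spaces would likely need a coarser grid with interpolation or some other approach.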

That said, I'm sure it would be useful. And I think a good first step, no matter what, is to set up some benchmarking.

The version on crates.io and the one here in the repo have different code for changing component types too, so that is also a factor. The old one may be harder to optimize. I'm also not sure how well the compiler is able to optimize other parts that involve floats (I heard long ago that float operations can't always be optimized without affecting the output), so there may be room for some manual improvements in general.

I would like to keep this particular issue focused on RGB, as it's a bit of a more isolated case, and discuss general benchmarking and lookup tables for color spaces separately. Just so it doesn't grow out of scope. 🙂

@Ogeon
Owner Author

Ogeon commented Apr 12, 2020

Issue #95 is already an open discussion for benchmarking in general.

@Ogeon
Owner Author

Ogeon commented Apr 18, 2020

I'm reserving this one for myself. I have some ideas I would like to explore, including changes to some of the traits.

@Ogeon Ogeon removed their assignment Mar 6, 2022
@Ogeon
Owner Author

Ogeon commented Mar 6, 2022

I'm un-reserving this. My ideas were not that great and something really simple is probably the best. I will get back to it later if nobody else does.

bors bot added a commit that referenced this issue Apr 9, 2022
279: Split the TransferFn trait and add lookup tables for sRGB r=Ogeon a=Ogeon

This changes how the transfer functions work a bit:

* `TransferFn` has been split into `FromLinear` and `IntoLinear`.
* The two traits take two type parameters; one for the linear encoding and one for the non-linear encoding.

The split is useful for cases when a conversion is irreversible, as mentioned in #245. The two type parameters make it possible to convert between number types while encoding and decoding. This makes it possible to skip `.into_format()` in some cases (currently when converting between `u8` and either `f32` or `f64`). In addition to that, I made those special cases faster by using [`fast-srgb8`](https://crates.io/crates/fast-srgb8), and a similar lookup table for `f64`.
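The idea behind the split can be sketched roughly as below. These are simplified stand-in traits written for illustration, not the definitions from the crate. The marker types `Srgb` and `DecodeOnly` here are hypothetical, and the `u8` path computes the value directly where the real change uses a lookup table.

```rust
/// Decoding half: encoded type `E` in, linear type `L` out.
trait IntoLinear<L, E> {
    fn into_linear(encoded: E) -> L;
}

/// Encoding half: linear type `L` in, encoded type `E` out.
trait FromLinear<L, E> {
    fn from_linear(linear: L) -> E;
}

/// Marker type standing in for the sRGB transfer function.
struct Srgb;

impl IntoLinear<f32, f32> for Srgb {
    fn into_linear(encoded: f32) -> f32 {
        if encoded <= 0.04045 {
            encoded / 12.92
        } else {
            ((encoded + 0.055) / 1.055).powf(2.4)
        }
    }
}

impl FromLinear<f32, f32> for Srgb {
    fn from_linear(linear: f32) -> f32 {
        if linear <= 0.0031308 {
            linear * 12.92
        } else {
            1.055 * linear.powf(1.0 / 2.4) - 0.055
        }
    }
}

/// u8 -> f32 in one step, so no separate format conversion is needed.
impl IntoLinear<f32, u8> for Srgb {
    fn into_linear(encoded: u8) -> f32 {
        <Srgb as IntoLinear<f32, f32>>::into_linear(encoded as f32 / 255.0)
    }
}

/// f32 -> u8 in one step.
impl FromLinear<f32, u8> for Srgb {
    fn from_linear(linear: f32) -> u8 {
        let encoded = <Srgb as FromLinear<f32, f32>>::from_linear(linear.clamp(0.0, 1.0));
        (encoded * 255.0).round() as u8
    }
}

/// A hypothetical encoding that can only be decoded. With split traits
/// it implements `IntoLinear` and simply leaves `FromLinear` out.
struct DecodeOnly;

impl IntoLinear<f32, f32> for DecodeOnly {
    fn into_linear(encoded: f32) -> f32 {
        encoded * encoded
    }
}
```

With this shape, `<Srgb as IntoLinear<f32, u8>>::into_linear(128)` yields a linear `f32` directly, and the type pair selects which implementation runs.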

## Closed Issues

* Closes #126.
* Closes #245.

## Breaking Change

* `TransferFn` has been split into `FromLinear` and `IntoLinear`.
* `{Rgb, Luma}::{from, into}_linear` and `{Rgb, Luma}::{from, into}_encoding` are now allowed to change the output type. This changes the API and type inference. The solution for the type inference is usually to remove `into_format` from the call chain.
* `{Rgb, Luma}::{from, into}_encoding` will only convert from linear encoding, not "cross convert" between encodings.
* The number type parameter and `'static` lifetime were removed from `RgbStandard`, `LumaStandard` and `RgbSpace`.



Co-authored-by: Erik Hedvall <erikwhedvall@gmail.com>
@bors bors bot closed this as completed in #279 Apr 9, 2022