sRGB conversion lookup tables #126

Ogeon · 2019-03-03T17:14:25Z

Using lookup tables, at least for converting u8 sRGB to and from linear RGB, would skip a bunch of calculations at the cost of space for the table(s). u8 -> f32 is just a simple lookup, while f32 -> u8 is trickier but should be possible, maybe through interpolation, or by going f32 -> linear u8 -> sRGB u8 if that has a reasonable precision loss. It needs some research.
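To make the idea concrete, here is a minimal sketch of the u8 -> f32 direction, assuming the standard sRGB transfer function. All function names are made up for illustration and are not palette's API. The f32 -> u8 direction is computed directly here instead of through a table, only to check that the round trip is lossless for every u8 value.

```rust
/// Decode one sRGB-encoded value in 0.0..=1.0 to linear light,
/// using the standard sRGB transfer function.
fn srgb_to_linear(encoded: f32) -> f32 {
    if encoded <= 0.04045 {
        encoded / 12.92
    } else {
        ((encoded + 0.055) / 1.055).powf(2.4)
    }
}

/// Encode one linear value in 0.0..=1.0 to sRGB (inverse of the above).
fn linear_to_srgb(linear: f32) -> f32 {
    if linear <= 0.0031308 {
        linear * 12.92
    } else {
        1.055 * linear.powf(1.0 / 2.4) - 0.055
    }
}

/// Build the 256-entry decode table once. Afterwards every
/// u8 -> f32 conversion is a single index operation.
fn build_decode_table() -> [f32; 256] {
    let mut table = [0.0f32; 256];
    for (i, entry) in table.iter_mut().enumerate() {
        *entry = srgb_to_linear(i as f32 / 255.0);
    }
    table
}

/// The "trickier" direction, done here without a table:
/// clamp, encode, scale and round.
fn linear_to_srgb_u8(linear: f32) -> u8 {
    let encoded = linear_to_srgb(linear.clamp(0.0, 1.0));
    (encoded * 255.0).round() as u8
}
```

The same table can serve the red, green and blue channels, since the transfer function is applied per component. It costs 256 * 4 bytes.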

okaneco · 2020-04-12T11:46:36Z

I think this could be a good idea for Lab too. I was working with converting large resolution images to Lab space and eventually back to Srgb. There's a very perceivable dropoff in speed of converting from Srgb to Lab as image size increases. I went down a rabbit hole of profiling/benchmarking and I'm not sure how much can be done.

These are the functions I used for benching. Each one gets a raw buffer from the image crate, converts it to Lab (or to LinSrgb in the second function), and then converts that back into the Rgb format for output.

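// Assumes a palette version where the `Pixel` trait provides
// `from_raw_slice` and `into_raw_slice`.
use palette::{Lab, LinSrgb, Pixel, Srgb};

// Round trip: sRGB u8 -> Lab f32 -> sRGB u8.
pub fn lab_f32(img: &[u8]) -> Vec<u8> {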
    let lab: Vec<Lab> = Srgb::from_raw_slice(&img)
        .iter()
        .map(|x| x.into_format().into())
        .collect();
    let srgb: Vec<Srgb<u8>> = lab.iter().map(|x| Srgb::from(*x).into_format()).collect();

    Srgb::into_raw_slice(&srgb).to_vec()
}

// Round trip: sRGB u8 -> linear sRGB f32 -> sRGB u8.
pub fn linsrgb_f32(img: &[u8]) -> Vec<u8> {
    let linsrgb: Vec<LinSrgb> = Srgb::from_raw_slice(&img)
        .iter()
        .map(|x| x.into_format().into_linear())
        .collect();
    let srgb: Vec<Srgb<u8>> = linsrgb
        .iter()
        .map(|x| Srgb::from_linear(*x).into_format())
        .collect();

    Srgb::into_raw_slice(&srgb).to_vec()
}

I set up a test project here https://github.com/okaneco/palette_toy_benchmarks

The timings represent the round trip time from Srgb -> color -> Srgb. The timings are brutal for Lab, and LinSrgb is a little less than half the time for Lab. In order to run enough samples or for enough time, you end up waiting a while. The increase in time is linear with the pixel count (but below a second or two it feels practically instant for smaller images).

1. Image dimensions: 600x338 - 202,800 pixels

| animal-beach-claw-crab-584501-w600.jpg | Time |
| --- | --- |
| lab_f32 | [77.947 ms 78.372 ms 78.806 ms] |
| linsrgb_f32 | [34.997 ms 35.218 ms 35.403 ms] |
| srgb_f32 | [11.843 ms 11.923 ms 12.014 ms] |
| srgb_u8 | [9.0099 ms 9.0387 ms 9.0686 ms] |

2. Image dimensions: 5331x3001 - 15,998,331 pixels

| animal-beach-claw-crab-584501.jpeg | Time |
| --- | --- |
| lab_f32 | [6.2878 s 6.3092 s 6.3310 s] |
| linsrgb_f32 | [2.8557 s 2.8723 s 2.9109 s] |
| srgb_f32 | [1.0217 s 1.0275 s 1.0330 s] |
| srgb_u8 | [729.06 ms 730.50 ms 732.63 ms] |
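As a quick check of the "linear with the pixel count" observation, the middle lab_f32 timings can be divided by the pixel counts. The helper below is only an illustration (the name `ns_per_pixel` is made up). Both image sizes come out at roughly 386 and 394 ns per pixel, so the per-pixel cost is nearly constant.

```rust
/// Nanoseconds spent per pixel for a measured round trip.
fn ns_per_pixel(seconds: f64, pixels: u64) -> f64 {
    seconds * 1e9 / pixels as f64
}

/// Middle lab_f32 values from the two measurements above,
/// as (seconds, pixel count).
const LAB_SMALL: (f64, u64) = (0.078372, 202_800);
const LAB_LARGE: (f64, u64) = (6.3092, 15_998_331);
```

Calling `ns_per_pixel(LAB_SMALL.0, LAB_SMALL.1)` and `ns_per_pixel(LAB_LARGE.0, LAB_LARGE.1)` gives values within about 2% of each other.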

There's also a main.rs that I used with cargo flamegraph to get an idea of where time is being spent. It uses the same two functions in separate loops. Unfortunately, I think you have to download the svg file in order to open it in your browser and interact with it. This is on a Mac and some dtrace settings are disabled for security, so I'm not sure how much detail is missing. Some calls seem unavoidable, like the math library functions cbrt, powf, and round. Next come the matrices, where a lot of time is spent in rgb_to_xyz and inverting, and then into_format. The last thing I noticed was max_intensity and float classification for the matrix inversion determinant. Anything I tried changing usually ended in no change or a regression; the compiler doesn't seem to need help.

It would probably be a good idea to set up a benching harness to track improvements/regressions. I need to try again but the new changes with into/from_color seem to improve performance slightly in the range of 5-10% for the benches I was running. I had changed the path to a local palette checkout at one point in the Cargo.toml.
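For a rough idea of what such a harness measures, here is a minimal std-only sketch. It is not a replacement for a real benchmarking crate (no warm-up, no statistics), and `time_runs` is a hypothetical helper name.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Time `runs` calls of `f` and return (fastest, median, slowest).
/// For an even number of runs the upper of the two middle samples is used.
/// Panics if `runs` is zero.
fn time_runs<T>(runs: usize, mut f: impl FnMut() -> T) -> (Duration, Duration, Duration) {
    assert!(runs > 0, "need at least one run");
    let mut samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let start = Instant::now();
            // black_box keeps the optimizer from discarding the result.
            black_box(f());
            start.elapsed()
        })
        .collect();
    samples.sort();
    (samples[0], samples[runs / 2], samples[runs - 1])
}
```

It could be called as `time_runs(10, || lab_f32(&buffer))`, where `buffer` is the raw image data. Tracking the median across commits would show improvements and regressions.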

Ogeon · 2020-04-12T12:09:42Z

Setting up a lookup table for converting between spaces will probably be a bit more complicated. The RGB table can use the same 1D table for all three components, while a table for converting between spaces will need to lookup based on all three components.
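To put numbers on the difference, here is a back-of-the-envelope size comparison, assuming f32 output and u8 input. The constant names are made up for the example.

```rust
use std::mem::size_of;

/// One shared per-channel table: one f32 for each possible u8 value.
const PER_CHANNEL_TABLE_BYTES: usize = 256 * size_of::<f32>();

/// A full table keyed on all three u8 components, storing three f32
/// output components per entry (for example L, a and b).
const FULL_RGB_TABLE_BYTES: usize = 256 * 256 * 256 * 3 * size_of::<f32>();
```

That is 1 KiB for the shared RGB table against 192 MiB for an exhaustive three-component table, so a table between color spaces would need a coarser grid plus interpolation, or some other scheme.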

That said, I'm sure it would be useful. And I think a good first step, no matter what, is to set up some benchmarking.

The versions on crates.io and here in the repo have different code for changing component types too, so that is also a factor. The old one may be harder to optimize. I'm also not sure how well the compiler is able to optimize other parts that involve floats (I heard long ago that they can't always be optimized without affecting the output), so there may be room for some manual improvements in general.
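A small illustration of why float code can't always be rearranged freely: floating-point addition is not associative, so regrouping a sum can change the result. The two helper functions are made up for this example.

```rust
/// Adds left to right: (a + b) + c.
fn sum_left(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// Same operands, regrouped: a + (b + c).
fn sum_right(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}
```

With the inputs 0.1, 0.2 and 0.3 the two groupings give different f64 results, which is why a compiler that preserves the exact output is not allowed to reorder such operations on its own.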

I would like to keep this particular issue focused on RGB, as it's a bit of a more isolated case, and discuss general benchmarking and lookup tables for color spaces separately. Just so it doesn't grow out of scope. 🙂

Ogeon · 2020-04-12T12:15:45Z

Issue #95 is already an open discussion for benchmarking in general.

Ogeon · 2020-04-18T20:00:08Z

I'm reserving this one for myself. I have some ideas I would like to explore, including changes to some of the traits.

Ogeon · 2022-03-06T18:52:22Z

I'm un-reserving this. My ideas were not that great and something really simple is probably the best. I will get back to it later if nobody else does.

279: Split the TransferFn trait and add lookup tables for sRGB r=Ogeon a=Ogeon

This changes how the transfer functions work a bit:

* `TransferFn` has been split into `FromLinear` and `IntoLinear`.
* The two traits take two type parameters; one for the linear encoding and one for the non-linear encoding.

The split is useful for cases when a conversion is irreversible, as mentioned in #245. The two type parameters makes it possible to convert between number types while encoding and decoding. This makes it possible to skip `.into_format()` in some cases (currently when converting between `u8` and either `f32` or `f64`). In addition to that, I made those special cases faster by using [`fast-srgb8`](https://crates.io/crates/fast-srgb8), and a similar lookup table for `f64`.

## Closed Issues

* Closes #126.
* Closes #245.

## Breaking Change

* `TransferFn` has been split into `FromLinear` and `IntoLinear`.
* `{Rgb, Luma}::{from, into}_linear` and `{Rgb, Luma}::{from, into}_encoding` are now allowed to change the output type. This changes the API and type inference. The solution for the type inference is usually to remove `into_format` from the call chain.
* `{Rgb, Luma}::{from, into}_encoding` will only convert from linear encoding, not "cross convert" between encodings.
* The number type parameter and `'static` lifetime were removed from `RgbStandard`, `LumaStandard` and `RgbSpace`.

Co-authored-by: Erik Hedvall <erikwhedvall@gmail.com>
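The two-parameter split can be sketched roughly as follows. The trait names follow the description above, but the signatures, the `SrgbFn` marker type and the impls are assumptions written for illustration, not palette's actual code. The u8 impls compute the formula directly where the real change uses lookup tables.

```rust
/// Illustrative stand-ins for the split traits. `L` is the linear
/// number type and `E` is the encoded (non-linear) number type.
trait IntoLinear<L, E> {
    fn into_linear(encoded: E) -> L;
}

trait FromLinear<L, E> {
    fn from_linear(linear: L) -> E;
}

/// Hypothetical marker type for the sRGB transfer function.
struct SrgbFn;

impl IntoLinear<f32, f32> for SrgbFn {
    fn into_linear(encoded: f32) -> f32 {
        if encoded <= 0.04045 {
            encoded / 12.92
        } else {
            ((encoded + 0.055) / 1.055).powf(2.4)
        }
    }
}

impl FromLinear<f32, f32> for SrgbFn {
    fn from_linear(linear: f32) -> f32 {
        if linear <= 0.0031308 {
            linear * 12.92
        } else {
            1.055 * linear.powf(1.0 / 2.4) - 0.055
        }
    }
}

// Mixed number types: decode straight from u8 to linear f32 ...
impl IntoLinear<f32, u8> for SrgbFn {
    fn into_linear(encoded: u8) -> f32 {
        <SrgbFn as IntoLinear<f32, f32>>::into_linear(f32::from(encoded) / 255.0)
    }
}

// ... and encode straight from linear f32 to u8, with clamping and rounding.
impl FromLinear<f32, u8> for SrgbFn {
    fn from_linear(linear: f32) -> u8 {
        let encoded = <SrgbFn as FromLinear<f32, f32>>::from_linear(linear.clamp(0.0, 1.0));
        (encoded * 255.0).round() as u8
    }
}
```

Because the number types are part of the trait, a caller can pick `<SrgbFn as IntoLinear<f32, u8>>::into_linear(value)` and get a linear `f32` in one step, without a separate format conversion. An irreversible transfer function would simply implement only one of the two traits.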

okaneco mentioned this issue Apr 12, 2020

Add benchmarks to have a baseline and be able to track improvements #95

Closed

Ogeon self-assigned this Apr 18, 2020

Ogeon removed their assignment Mar 6, 2022

Ogeon mentioned this issue Apr 3, 2022

Split the TransferFn trait and add lookup tables for sRGB #279

Merged

bors bot closed this as completed in #279 Apr 9, 2022
