-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Writing code generic over pixel type is hard #2358
Comments
More input on the missing pieces for generic code can be found here |
So there's a couple different aspects that you might want to abstract over:
With the current trait definitions, if you try to write a method that's generic over any |
We have a tracking issue for I think we should focus on making it easy to write code that covers all variants of |
I think that's a good principle. To be more specific list out what aspects we need to be generic over, we at a minimum need 1-3 below. Items 4-5 aren't currently tracked in
There's a bunch of other dimensions that can also be important, but I think can be left out of scope for now...
Next comes the question of how to be generic over the various dimensions. The current strategy has been to use the traits from I don't have a fully thought out design on how to improve this, but I think a critical piece may lie in defining the exhaustive set of possibilities for some dimensions so that users can directly dispatch on that decision. As an extreme example, we could remove the A less extreme example would be exposing a helper function that takes an image and calls one of three closures: either a closure taking an |
Fast blur is a great test-case for a generic image/pixel API. Computation-heavy generic code in Rust is usually ugly anyway. Even blur on a basic single channel of bare integers needs a painful amount of trait bounds. I've technically managed to write a generic one, but it's horrible: fn blur<P, S, X>(mut img: ImgRefMut<P>, radius: u8) -> Result<(), TryReserveError>
where
S: Sum<S>
+ std::fmt::Debug
+ std::cmp::PartialEq
+ std::ops::SubAssign
+ std::ops::AddAssign
+ std::ops::Mul<u16, Output = S>
+ std::ops::Add<Output = S>
+ Copy
+ std::default::Default,
X: Sum<X>
+ std::fmt::Debug
+ std::cmp::PartialEq
+ std::ops::SubAssign
+ std::ops::AddAssign
+ std::ops::Mul<u32, Output = X>
+ std::ops::Add<Output = X>
+ Copy
+ std::default::Default,
P: ComponentMap<S, u8, u16> + Copy,
S: ComponentMap<X, u16, u32>,
X: ComponentMap<P, u32, u8>,
{ called like
|
BTW, many image processing algorithms can't be generic over the purpose of the components or even the transfer function. Giving users access to a bunch of numeric pixel values regardless of the format, like For example, blur should be done in linearized RGB with premultiplied alpha. Everything deviating from that gets worse. Blending in sRGB is a super common mistake that makes green-red edges brown, makes black on white text thicker, but white on black text disappear. Blurring in YCbCr creates a ton of hue and brightness shifts. Blurring with uncorrelated alpha creates completely wrong colors at the borders, either uncovering previously invisible garbage or creating dark halos. Even a simple operation like converting from a transparent to opaque image by adding a background color needs the background color converted to a matching color space and blended properly. I haven't found a satisfactory solution for this. I just make pixel formats an explicit enum, and have helper functions that convert the images down to a smaller easier-to-support subset of pixel formats as needed. |
I haven't found cases where I'd want image to abstract away whether it's RGB or YUV, because image processing on YUV makes subsampling look even worse, so any image processing needs RGB, and outputs like JPEG or AVIF even when they support RGB, work better with YUV, so I want these conversions to happen anyway, and be explicit. For everything I work on, I found it sufficient to just support row-oriented storage with stride. This can also support planar types if necessary (like YUV chroma subsampling) by treating them as three images. I would like to develop a cross-crate API that allows combining row-based image decoding/processing/encoding similarly to Rust's Theoretically, in-memory storage could use tiles or even something clever like z-curve for better memory locality of 2D algorithms, but I couldn't find a case where I could justify that complexity. Separable kernels can be transposed, and large non-separable kernels are too expensive anyway. Copying an image once or twice is not expensive any more, so algorithms that really need a lot of locality can work on a copy. For really expensive algorithms, the Rust image layout/API stops being relevant, because the best way to make that faster isn't a smarter Rust type, but moving the computation to the GPU. |
Not in a great shape right now. There's
We're going to have to start requiring row-based slice access if we want to have performant algorithms: #2300 Given how much performance depends on cache locality and vectorization, not to mention the math and bounds checks on every pixel access that non-contiguous storage incurs, it's going to be cheaper to convert the image to a contiguous layout and run the operations on it than to keep it non-contiguous. And I cannot imagine what use case allowing non-contiguous rows would even serve.
Unless you're willing to copy-paste the same code 10 times, you still need some sort of method to abstract over the exact types involved. This involves either generics (like mine here, also with an excessive amount of bounds, and it doesn't even work for floats) or macros. I think making |
YUV works completely fine when you want to directly operate on something as intensity channel and avoiding chroma for ex. in CLAHE, and this cannot be done in RGB. Even this is worse than CIE LAB or Oklab, YUV conversion is cheap. However, I agree it is definitely better to do it explicit. |
At the moment everything of this is works well by doing so: fn amazing_f32_processing<const CHANNELS: usize>(f: &mut [f32]) {
panic!("Here is f32 {} channels", CHANNELS);
}
fn amazing_u8_processing<const CHANNELS: usize>(f: &mut [u8]) {
panic!("Here is u8 {} channels", CHANNELS);
}
pub trait AwesomeHandler<T> {
fn handle_awesome<const CHANNELS: usize>(data: &mut [T]);
}
impl AwesomeHandler<f32> for f32 {
fn handle_awesome<const CHANNELS: usize>(data: &mut [f32]) {
amazing_f32_processing::<CHANNELS>(data);
}
}
impl AwesomeHandler<u8> for u8 {
fn handle_awesome<const CHANNELS: usize>(data: &mut [u8]) {
amazing_u8_processing::<CHANNELS>(data)
}
}
fn make_some_amazing<T: Copy + AwesomeHandler<T>, const CHANNELS: usize>(f: &mut [T]) {
T::handle_awesome::<CHANNELS>(f)
} However, this requires a lot of understanding what is going on and tons of boilerplate but this is do not seems to be avoidable. Often everything if this might be grouped as an example here /// # Generics
/// `T` - data type
/// `J` - accumulator type
impl<T, J, const COMPONENTS: usize> HorizontalStackBlurPass<T, J, COMPONENTS>
where
J: Copy
+ 'static
+ FromPrimitive
+ AddAssign<J>
+ Mul<Output = J>
+ Sub<Output = J>
+ AsPrimitive<f32>
+ SubAssign
+ AsPrimitive<T>
+ Default,
T: Copy + AsPrimitive<J> + FromPrimitive,
i32: AsPrimitive<J>,
u32: AsPrimitive<J>,
f32: AsPrimitive<T> + AsPrimitive<J>,
usize: AsPrimitive<J>,
{
} and called like so impl AwesomeHandler<u8> for u8 {
fn handle_awesome<const CHANNELS: usize>(data: &mut [u8]) {
HorizontalStackBlurPass::<u8, f32, CHANNELS>
}
} But at the moment component type is unclear ( or at least I do not understand how to access it to play with it ). Or contains some untrivial trait which one is also is unclear how to work with it. |
Using impls is inherently a bad fit for this purpose. As @kornelski sketched, the most approachable way to generalize is by listing the available and supported representations at some point in an exhaustive manner. The conflict is that the equivalent solution of traits will always require a form of specialization. I don't think any arithmetic traits are sufficient. When we're seriously considering efficiency then a support for different SIMD implementations will need to happen. The best way to dispatch here is again if we have some exhaustive list, in those case not only exhaustive to the library but also exhaustive to the callee providing the operator that's being mapped. In terms of texel formats that is:
An interface with unconstrained generics such as That's in very general terms the approach I took while implementing the conversion paths in |
Yes, I'd also vote for exhaustive match in general. |
I think you would need to cast it to |
The |
We'll probably want row-contiguous views as well, which allow viewing rows as |
To be clear, |
I've lost a track already. Lines 64 to 69 in 25fbd3f
This is quite confising then and purpose of this is unclear. And what purpose to do for example PR with blur into repo if external crate have exactly same access as in crate. |
Even your contributor, @Shnatsel, perhaps didn't know before this time that it is expected to be accessed only by match on |
@awxkee There's a couple different things going on:
|
At least it is straightforward and allows you easy adopt traits for each case, or ignore some, if necessary, with a little boilerplate price for that.
I think this almost always will come with some breaking release, do not think this is real issue with this.
Thanks, I think I understood, at least main part of the issue |
I've tried to write code generic over pixel type here, but couldn't get the generic bounds to work out because
Primitive
doesn't provide arithmetic operations.Another instance of someone trying and failing to write generic code is here.
It seems the API for that could be improved. I don't have a solution in mind right now, but I wanted to write down these instances so that we could at least get an idea of what the issues are.
The text was updated successfully, but these errors were encountered: