Use Core Image to convert between color spaces rather than Metal #1
Comments
Hi fumo, yes, you are correct that Apple already implements colorspace conversions inside its own libraries. For example, one could simply play a .m4v video file with AVPlayerViewController; working code for that approach is in the Xcode project build target named AVPlayerViewController. One could also use CoreImage to decode from YCbCr -> sRGB; an example of that approach is in the target named CoreVideoDecodeiOS. But the problem with Apple software in general is that it is all closed source, so it is very difficult to determine what Apple actually did when implementing things, and next to impossible to find working code that does any of this. What this project attempts to do is provide working software that is compatible with the way Apple decodes video, and also a way to encode RGB data to video so that it decodes back to the original RGB values. The "colorspace conversion" you describe is actually just gamma correction when converting from BT.709 -> sRGB, since the color primaries are the same in both colorspaces. This Metal implementation addresses these problems and also provides a correct implementation of scaling, since sampling from non-linear pixels is not trivial when rescaling is involved.
Ah, so this project is just for academic purposes?
No, I am going to produce another, completely different commercial library based on this logic, but the point here is to create a real working implementation of what Apple actually did. What is critical is a correct implementation of the BT.709 matrix transform and of the gamma correction at encode and decode time, because everything else depends on these two pieces being implemented correctly. After literally weeks of work, I think that what I have now is actually compatible, but it has been a strange trip getting to this point. Documentation stinks, there is almost no working code or examples, and almost every piece of software I have looked at does things slightly differently; many of them are just plain wrong in one way or another.
For encode, I think you can use the regular ITU-R 709 color profile since that is what the rest of the industry uses?
I tried that before, but it does not produce results that can be inverted by the default Apple gamma curve of 1.961. Basically, it seems that RGB input would need to be boosted beforehand by gamma 1.2 -> 1.25 (before being encoded with the BT.709 gamma curve) in order to decode back to something near the original values. Using the exact inverse of the 1.961 gamma curve when encoding gives me the most exact RGB -> YCbCr -> RGB results while also maintaining compatibility with the way Apple would display professionally authored video that was shot on a camera and touched up in post production. It is weird, but this seems to be the most mathematically correct approach for these two use cases. You can see this in action by running the MetalBT709Decoder-iOS target and uncommenting the call to decodeCloudsiPadImage in decodeH264YCbCr in the file AAPLRenderer.m. The original JPEG, examined side by side on the same screen, looks as close to identical as one could expect given that the gamma is completely different, and the compressed M4V is almost half the size of the original JPEG.
Hmm, I’ve confirmed your results using ColorSync Utility.
1. Original sRGB Image
2. ✅ ITU-R 709 Image: converted from sRGB→709 using ColorSync Utility’s Match to Profile function.
3. ❌ HDTV Image: replaced 709 with Apple’s HDTV profile using ColorSync Utility’s Assign Profile function. There is not enough contrast, so it’s hard to see the object on the left side of the wall, and the detail in the woman’s hair is lost.
@UliZappe I didn’t realize the difference between ITU-R 709 and Apple’s HDTV profile is so big. I assume the slope limit is already applied when viewing these images on an Apple operating system?
In fact, the huge original "Color Management (OS X): Image is too dark" thread, from which the whole discussion started, began with the observation that assuming BT.709 as the video color space did not work correctly, whereas, as we found out after some time, BT.709 with gamma 1.961 (= Apple HDTV) did. Anyway, converting to one color space (ITU-R 709) and then assigning a different one (Apple HDTV) necessarily produces incorrect results in ICC color management.
Yep. This is also true for other operating systems, as long as they use an Apple, Adobe or Kodak CMM (or maybe even others I’m not aware of). If the Little CMS CMM is used, no slope limit is applied.
I was simulating the situation where an encoding application uses the ITU-R 709 transfer function and a decoding application uses the approximated transfer function to see the loss in quality.
One issue that I was confused about is which gamma curve AVFoundation uses when exporting with AVAssetWriter and AVAssetWriterInputPixelBufferAdaptor. Previously, I was seeing that the exported YCbCr values seemed to be larger than with BT.709, but I just retested and compared to sRGB, and the results indicate that AVFoundation also uses the HDTV profile to export with a 1.961 gamma value, even when sRGB is indicated as the base colorspace of a CoreVideo pixel buffer. This seems to indicate that my other export process, which uses vImage and the 1.961 gamma settings, is in fact correct. The weird thing is that there is a Y range issue with the AVAssetWriterInputPixelBufferAdaptor export: the emitted Y values are in the range [16, 237] as opposed to [16, 235]. In the graph, the red line is the segmented BT.709 curve; the yellow values drawn over the blue Apple196 line are the quantized AVFoundation Y values; and the purple line with the orange curve shows the original sRGB values along with the exact sRGB curve. The point of the graph is that Apple internally emits YCbCr values by raising to pow(x, 1.0/1.961) so that the Apple decoding curve outputs RGB as close to the original values as possible.
After additional testing, I am finding some very interesting results when it comes to compression. I have reworked the srgb_to_bt709 command line tool that converts an image to Y4M (to be encoded with ffmpeg/x264). The encoding process now supports Apple gamma 196 and also encoding directly with the sRGB gamma curve. The sRGB curve does just a little bit better in terms of round trip, though both approaches are very good in that they have a maximum round trip error of ±2. What is interesting, though, is that encoding with the sRGB gamma seems to have an advantage when doing lossy encoding: the sRGB encoded image data is only 67Kb while the Apple gamma encoding is 349Kb, which is a huge compression benefit from encoding with the sRGB gamma. Further testing would be needed to determine whether this result holds for other example input. This specific example of a drop of water has some large smooth areas, so it might be an edge case.
Unless I am misunderstanding something, it seems like you are introducing a lot of unnecessary complexity by using Metal to convert between color spaces. It looks like Core Image can do the job for you?
- `-[CIImage initWithCVPixelBuffer:]` will build and use the color space from the color space attachments in the image buffer.
- `kCIContextWorkingColorSpace` can be used to specify the working color space of the `CIContext`. You would use the same color space as the `CIImage` (via `-[CIImage colorSpace]`).
- Convert the `CIImage` to any other image format, specifying the destination color space (in your case, sRGB) in the respective conversion functions:
  - `-[CIImage imageByColorMatchingWorkingSpaceToColorSpace:]` to get another `CIImage` (e.g. to put into a `UIImage` for display in a `UIImageView`).
  - `-[CIContext createCGImage:fromRect:format:colorSpace:]` to get a `CGImage`.
  - `-[CIContext render:toCVPixelBuffer:bounds:colorSpace:]` to render into a `CVPixelBuffer`.
  - `-[CIContext render:toIOSurface:bounds:colorSpace:]` to render into an `IOSurface`.
  - `-[CIContext render:toMTLTexture:commandBuffer:bounds:colorSpace:]` to render into a `MTLTexture`.

(Caveat: I have not personally tested any of these functions except the first one.)