Use psnr to compare frames #662
Conversation
This commit brings a dependency on torcheval into the tests. Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Thank you for the PR @dvrogozh, some high-level comments:
Thank you for your comments, @NicolasHug. This PR touches an important topic: the approach the torchcodec tests use to validate the correctness of video frames. I understand your motivation to use a straightforward and simple bit-exactness metric. That is indeed the easiest thing to do, and it would work for most decoders (at least the most important ones like avc, hevc, av1, etc.), which don't use floating point operations and are required to be conformant. The problem comes with video processing algorithms such as color conversion or scaling, and it further multiplies once you start considering video encoding. It is basically a given that you can't apply a simple bit-to-bit exactness metric across different implementations with a single ground-truth baseline (taken from cpu or cuda or any other processor). To properly compare video frames you must use dedicated metrics such as PSNR, SSIM, or MS-SSIM. My concern with the current torchcodec test design is that the project will run into issues in the long run as the code base and features grow, potentially spanning multiple implementations on different platforms. Even as of today, video frame comparison is handled via ~5 different paths, 2 for cuda, 2 for cpus and non-cuda, 1 for even more complex cases ( Line 113 in ceeeb11
I'm afraid that going forward this might become a real project sustainability issue. I intentionally did this PR in a blunt way, from the perspective of how I would approach this task on day 0. The project is past day 0 by now, and I totally understand the desire to preserve the current baseline checks. I think, however, we can still discuss this and come to a better solution that helps project maintenance in the future. I would like to propose the following:
In this way, the 1st item gives us a baseline which all implementations of torchcodec should pass, while the 2nd item allows strengthening the requirements for particular cases. The
That's an implementation detail :). There is indeed a question whether
That would be reinventing the wheel, i.e. an attempt to define a new metric while ones already exist and have been used for media for decades: PSNR/SSIM/MS-SSIM.
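To make the metric under discussion concrete, here is a minimal sketch of PSNR between two frames, written with plain NumPy rather than torcheval or torchcodec's actual test helpers (the function name and shapes are illustrative, not the project's API). PSNR is `10 * log10(MAX^2 / MSE)`, so bit-exact frames score infinity and larger pixel errors score lower:

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two frames of equal shape."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # bit-exact frames: PSNR is unbounded
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.zeros((4, 4), dtype=np.uint8)
off = np.full((4, 4), 2, dtype=np.uint8)  # every pixel off by 2
print(round(psnr(ref, off), 2))  # -> 42.11 (dB)
```

A uniform per-pixel error of 2 already scores ~42 dB, well above typical "visually identical" thresholds (often 30-40 dB), which is exactly why PSNR tolerates small rounding differences between color-conversion implementations that a bit-exact check rejects.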
For: #569
With attempts to support XPU acceleration (Intel GPUs) and aarch64, we see the need to generalize frame comparison in a way that works across different platforms and implementations of color space conversion algorithms. In the media domain such comparisons are achieved with metrics such as PSNR or SSIM. This commit introduces such a comparison for all platforms. The commit brings a dependency on torcheval into the tests, which is used to calculate PSNR.

CC: @traversaro, @scotts, @NicolasHug
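As a sketch of how a test could assert closeness by PSNR instead of bit-exactness, here is a hypothetical `assert_frames_close` helper (the name, the 30 dB threshold, and the NumPy implementation are illustrative assumptions, not torchcodec's or torcheval's actual API):

```python
import numpy as np

def assert_frames_close(ref, test, min_psnr_db=30.0, max_val=255.0):
    # Hypothetical test helper: pass when frames are "close enough" by PSNR
    # rather than bit-exact. 30 dB is an illustrative threshold only.
    diff = ref.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    score = float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
    assert score >= min_psnr_db, f"PSNR {score:.2f} dB below {min_psnr_db} dB"

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
# Simulate small per-pixel differences, as between two color converters:
noise = rng.integers(-2, 3, size=ref.shape)
noisy = np.clip(ref.astype(np.int16) + noise, 0, 255).astype(np.uint8)
assert_frames_close(ref, noisy)  # passes: rounding-like noise scores high PSNR
```

The point of such a helper is that one threshold serves every backend (cpu, cuda, xpu, aarch64), while a grossly wrong frame, e.g. an inverted image, still fails loudly.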