
Use psnr to compare frames #662


Open · wants to merge 1 commit into main

Conversation

dvrogozh
Contributor

For: #569

With attempts to support XPU acceleration (Intel GPUs) and aarch64, we see the need to generalize frame comparison in a way that works across different platforms and implementations of color space conversion algorithms. In the media domain such comparisons are achieved with metrics like PSNR or SSIM. This commit introduces such a comparison for all platforms. It adds a dependency on torcheval in the tests, which is used to calculate PSNR.
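For illustration, a minimal sketch of what a PSNR-based check could look like with torcheval's functional API (the helper name and the 30 dB threshold below are placeholders, not necessarily what this PR implements):

import torch
from torcheval.metrics.functional import peak_signal_noise_ratio

# Hypothetical helper: accept frames that are visually equivalent even if
# they are not bit-exact. The 30 dB default threshold is illustrative only.
def assert_frames_close(actual, expected, min_psnr=30.0):
    psnr = peak_signal_noise_ratio(
        actual.float(), expected.float(), data_range=255.0  # 8-bit frames
    )
    assert psnr >= min_psnr, f"PSNR {psnr:.2f} dB is below {min_psnr} dB"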

CC: @traversaro, @scotts, @NicolasHug

This commit adds a dependency on torcheval in the tests.

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
@NicolasHug
Member

Thank you for the PR, @dvrogozh. Some high-level comments:

  1. Let's not modify the existing checks for the platforms we natively support. Some of our checks are dumb-and-simple, and there is value in that.
  2. Instead of adding a new metric to account for other architectures, we should try to solve the more general problem of allowing out-of-core repos to provide their own frame comparison function. I'm not sure yet how to enable this, but we should let each platform decide individually rather than have a centralized function within torchcodec.
  3. If possible, we should try not to rely on an extra dependency (torcheval). It's OK if we need to vendor some of the code, especially if it's just for one metric. But with 1 and 2, I don't think we'll need to add any new metric within torchcodec itself.
  4. Have you considered relying on this check for your XPU work? It is more permissive than what we currently do on linux:

    torchcodec/test/utils.py, lines 54 to 56 in ed13ac5:

    # Asserts that at least `percentage`% of the values are within the absolute tolerance.
    # Percentage is expected in [0, 100] (actually, [60, 100])
    def assert_tensor_close_on_at_least(actual_tensor, ref_tensor, *, percentage, atol):
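For readers without the repo open, a sketch of how such a percentage-based check can be implemented (the actual body in torchcodec/test/utils.py may differ):

import torch

def assert_tensor_close_on_at_least(actual_tensor, ref_tensor, *, percentage, atol):
    # Share of values whose absolute difference is within atol, in percent
    abs_diff = (actual_tensor.float() - ref_tensor.float()).abs()
    pct_within = (abs_diff <= atol).float().mean().item() * 100
    assert pct_within >= percentage, (
        f"Only {pct_within:.1f}% of values are within atol={atol}; "
        f"expected at least {percentage}%"
    )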

@dvrogozh
Contributor Author

Thank you for your comments, @NicolasHug. This PR touches an important topic: the approach used in torchcodec tests to validate the correctness of video frames. I understand your motivation to use straightforward and simple bit-exactness checks. That is indeed the easiest thing to do, and it works for most decoders (at least the most important ones like avc, hevc, av1, etc.), i.e. those which don't use floating point operations and are required to be conformant. The problems come with video processing algorithms such as color conversion or scaling, and they further multiply once you start considering video encoding. It is basically a given fact that you cannot use a simple bit-to-bit exactness check across different implementations with a single ground-truth baseline (taken from cpu, cuda or any other processor). To properly compare video frames you must use dedicated metrics such as PSNR, SSIM or MS-SSIM.

My concern with the current torchcodec tests design is that the project will run into issues in the long run, as the code base and features grow and potentially span multiple implementations on different platforms. Even as of today you handle video frame comparison with ~5 different paths: 2 for cuda, 2 for cpu and non-cuda, and 1 for even more complex cases (the assert_tensor_close_on_at_least you pointed me to). Add to that device-specific checks in individual tests, such as the one in test_get_frame_at_pts which I caught while working on this PR:

with pytest.raises(AssertionError):

I'm afraid that going forward this might become a real issue for project sustainability.

I intentionally wrote this PR in a blunt way, reflecting how I would approach this task on day 0. The project is of course past day 0, and I totally understand the desire to preserve the current baseline checks. I think, however, that we can still discuss this and arrive at a better solution that helps project maintenance in the future. I would like to propose the following:

  1. Rely on standard media metrics (PSNR or SSIM) as a design principle for torchcodec tests in cases where bit exactness cannot be guaranteed by design
  2. In addition to the above, strengthen the checks for particular cases, such as bit exactness for x86_64 cpu or minor deviations in the case of cuda
    • In my mind, your proposal on "allowing out-of-core repos to provide their own frame comparison function" would fall under this item

In this way, the 1st item gives us a baseline which all implementations of torchcodec should pass, while the 2nd item allows strengthening the requirements for particular cases. assert_frames_equal would then look something like:

def assert_frames_equal(input, output, psnr_threshold):
  psnr = calculate_psnr(input, output)
  assert psnr >= psnr_threshold

  if input.device.type == "cpu":
    torch.testing.assert_close(input, output, atol=0, rtol=0)
  elif input.device.type == "cuda":
    ...
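A hypothetical call site could then pick a threshold per platform (the values below are purely illustrative):

assert_frames_equal(ref_frames, cuda_frames, psnr_threshold=40.0)
assert_frames_equal(ref_frames, xpu_frames, psnr_threshold=35.0)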

> If possible, we should try not to rely on an extra dependency (torcheval)

That's an implementation detail :). There is indeed a question of whether torcheval is the right way to go for PSNR/SSIM metrics: we need some understanding of that project's future before deciding to rely on it. I hoped you might have some insight on that? Alternatively, we can consider pulling a PSNR implementation directly into torchcodec test utils, or rely on another library for such calculations. That's a point to discuss.
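If vendoring turns out to be preferable, a self-contained PSNR helper is small. A sketch in plain torch (the function name and the 255 default for 8-bit content are assumptions):

import torch

def calculate_psnr(actual, expected, max_val=255.0):
    # PSNR = 10 * log10(MAX^2 / MSE), in dB; identical frames give +inf
    mse = torch.mean((actual.float() - expected.float()) ** 2)
    if mse == 0:
        return float("inf")
    return (10 * torch.log10(max_val ** 2 / mse)).item()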

> Have you considered relying on this check for your XPU work?

That would be reinventing the wheel, i.e. an attempt to define our own metric while standard ones (PSNR, SSIM, MS-SSIM) have existed and been used in media for decades.
