Corner case is giving wrong VMAF and PSNR values! #371
Comments
For VMAF, please refer to the FAQ below: For PSNR, we cap the value at 60 dB for 8-bit representation and 72 dB for 10-bit representation. The cap follows from the PSNR formula and the mean squared error introduced when an arbitrary signal is quantized to x bits. |
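The capping described above can be sketched in a few lines of Python. This is only an illustration of the behaviour, not the actual VMAF code; `cap_psnr` and `PSNR_CAP_DB` are made-up names, and the two caps are simply the values quoted in this thread.

```python
import math

# Caps quoted in this thread, in dB, keyed by bit depth (illustrative table).
PSNR_CAP_DB = {8: 60.0, 10: 72.0}


def cap_psnr(psnr_db: float, bit_depth: int) -> float:
    """Clamp a PSNR value to the cap for the given bit depth."""
    return min(psnr_db, PSNR_CAP_DB[bit_depth])


print(cap_psnr(45.3, 8))       # below the cap, returned unchanged
print(cap_psnr(math.inf, 8))   # identical frames (MSE = 0) report the 8-bit cap
print(cap_psnr(math.inf, 10))  # identical frames report the 10-bit cap
```

With a clamp like this, comparing a file against itself gives a constant 60 dB (8-bit) or 72 dB (10-bit), because the uncapped value is infinite.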
This gives us a pretty good understanding of the use cases for VMAF. This corner case tells us that the precision of VMAF is, in the best case, 3 VMAF points, which is already 50% of a visible difference (6 VMAF points). This has to be taken into account especially for encoder comparisons! Additionally, PSNR is capped at 60 dB. From a technical point of view there is no reason why PSNR values above 60 dB (8-bit) couldn’t be calculated. The downside is that VMAF in this version can only be used for the distribution of highly compressed video. But for overall video quality, the quality and frequency range of the ‘Intermediate Distribution Master’ should also be monitored. A definition of ‘video resolution’ is still missing. For distribution, the video line count and a VMAF (SSIM, PSNR) value give us confidence that the quality is acceptable for most viewing conditions. But video quality and resolution are described by a much richer range of parameters, which are also important for the encoding ladder. |
For confidence of VMAF prediction, a better way would be to use bootstrapping to quantify the 95% confidence interval of each prediction. See this page for more information: For PSNR, to see why we use 60 dB to cap 8 bit and 72 dB to cap 10 bit, the rule-of-thumb formula is |
PSNR: capped at 60dB: |
Note that the uniform noise is the quantization noise resulting from the 8-bit representation. It has nothing to do with the film grain or camera noise in the source. The 60 dB can be thought of as the fundamental limit of what an 8-bit representation can give you. Anything beyond 60 dB is not sensible. |
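For reference, the uniform-noise argument can be worked through numerically. The sketch below assumes the usual model, in which quantization error is uniformly distributed over one quantization step and therefore has variance 1/12, and takes the peak as 2^N - 1. The function name is illustrative only.

```python
import math


def quantization_psnr_limit(bit_depth: int) -> float:
    """PSNR in dB when the only error is uniform quantization noise."""
    peak = 2 ** bit_depth - 1  # 255 for 8-bit, 1023 for 10-bit
    mse = 1.0 / 12.0           # variance of uniform noise over one step
    return 10.0 * math.log10(peak ** 2 / mse)


print(round(quantization_psnr_limit(8), 2))   # about 58.92 dB
print(round(quantization_psnr_limit(10), 2))  # about 70.99 dB
```

Under these assumptions the limits come out slightly below the 60 dB and 72 dB caps mentioned above.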
60 seems a bit high for 8 bit video. |
Even for lossy video I got dB values above 60dB in some cases. |
More than 60 is meaningless for 8 bit video. 8 bit quantisation means you can't have a PSNR of greater than 58.9 dB. |
The quantization already happened much earlier in the workflow, in the camera (14/16-bit) and later when downsampling from 12/10-bit to 8-bit. At the distribution (encoding) stage we have 8- or 10-bit values per channel, and there is an error based on this quantization, which is part of the noise and grain amplitudes, but you can definitely get dB values above 100 dB for 8 bpc images. |
By definition, you can not improve the channel performance beyond the quantisation noise limit. Therefore the maximum valid PSNR is the quantisation noise limit. This is easily calculable. For a value above that PSNR value to be valid, you'd need to have a non-uniform distribution of quantisation noise in the initial A/D conversion. |
The removal of video noise and grain will lead to PSNR values between 113dB and 58dB. Even with an additional lossy compression with high bitrates you will get PSNR values above 60 dB. For PSNR values below 60dB the noise and grain structure has been totally destroyed. For video distribution this matters! |
I repeat: By definition, you can not improve the channel performance beyond the quantisation noise limit. Therefore the maximum valid PSNR is the quantisation noise limit. This is easily calculable. Any number above this is not valid. |
I think at this point we can stop the conversation! Thank you for sharing your opinion. |
It's not an opinion. I've never seen a text discussing the derivation that doesn't mention this principle. It's an easy mathematical proof. |
In this case you have to provide the mathematical proof; otherwise it's only your opinion. |
Now I see the problem. It's your MSE calculation. The lowest mean squared error can be calculated by changing the brightness of only one pixel by a value of one for 8-bit: By the way, quantization noise is only one of the following possible sources:
Additionally, you have grain if you capture 35mm film.
Here are some real-world examples:
Wavelet Beam noise management and lossless compression:
n:467 mse_avg:0.09 mse_y:0.05 mse_u:0.04 mse_v:0.30 psnr_avg:58.56 psnr_y:61.14 psnr_u:61.77 psnr_v:53.35
Wavelet Beam noise management and lossy compression H264 1920x1080 @ 7,4Mbit/s:
Wavelet Beam noise management and lossy compression ProRes 1920x1080 |
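The single-pixel argument in this comment can be made concrete. The sketch below assumes a 1920x1080 frame (the size used in the examples above) and counts the luma plane only; including the chroma planes changes the sample count and therefore the result. The function name is hypothetical.

```python
import math


def single_pixel_psnr(width: int, height: int, bit_depth: int = 8) -> float:
    """PSNR in dB when exactly one sample differs by one code value."""
    peak = 2 ** bit_depth - 1
    mse = 1.0 / (width * height)  # one squared error of 1, averaged over the plane
    return 10.0 * math.log10(peak ** 2 / mse)


print(round(single_pixel_psnr(1920, 1080), 1))  # about 111.3 dB
```

No pair of non-identical 8-bit planes of that size can have a smaller MSE, so this is the largest finite PSNR the formula can produce for them.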
Please help! |
First of all: Secondly, most encoders (and encoder configs) use 10-bit internally, so by default you get capped at 70.98-something dB at the least. Thus it makes sense that you get values over 60 dB, but I've never encountered a higher PSNR in an encoder log file. You should double-check them. So how many bits does your encoder use internally? I'm really confused by the formulas you both gave; they both make sense to me to a certain degree. |
Hello Ruben, |
Of course, by the formula you provided, you would get 0 dB if, for example, one frame is black and the other is white, and similarly inf dB for pictures that are the same. |
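Both extremes mentioned here fall straight out of the uncapped formula. A minimal sketch, assuming frames are given as flat lists of 8-bit sample values (the `psnr` helper is illustrative and not taken from any tool discussed in this thread):

```python
import math


def psnr(reference, distorted, peak=255):
    """Uncapped PSNR in dB between two equally sized sample sequences."""
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of samples")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: the ratio is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


black = [0] * 16
white = [255] * 16
print(psnr(black, white))  # 0.0, because the MSE equals peak squared
print(psnr(black, black))  # inf
```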
PSNR is still my first choice if I have to evaluate picture quality. PSNR values > 50 dB will appear if we are comparing the original picture versus, e.g., the denoised picture, or if we are evaluating intermediate formats like IMF. Additionally, we are using VMAF through the whole workflow and not only for distribution (glass-to-glass). There are some tools out there which also calculate PSNR values, like FFmpeg. We have developed our own tools for MATLAB, C++ and CUDA. |
Did you encounter values higher than the capped values (60 dB for 8 bit, 72 dB for 10 bit) given by @st599's formulas with ffmpeg? |
I have added a FAQ at: Hopefully this addresses the issue. |
We tested two different VMAF versions under Windows. For the corner case where the reference video and the video under test (distorted video) are the same, we expected VMAF values of 100 for each frame and PSNR values around 116 dB (8-bit). But we got VMAF values between 97 and 100 and a constant PSNR value of 60 dB (for 10-bit files we got a PSNR value of 72 dB).
http://projekte.waveletbeam.com/VMAF_v2.jpg
http://projekte.waveletbeam.com/test8Bit_v2.csv
For PSNR, maybe the formula 10Log(MAX/ROOT(MSE)) is used instead of 20Log(MAX/ROOT(MSE)).
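To make the difference between the two forms concrete, here is a small sketch (function names are illustrative). The factor 20 belongs with the amplitude ratio MAX/ROOT(MSE), the factor 10 with the power ratio MAX²/MSE; applying the factor 10 to the amplitude ratio would halve the result. This only compares the formulas and says nothing about which one a given tool implements.

```python
import math


def psnr_amplitude_form(mse: float, peak: float = 255.0) -> float:
    """20 * log10(MAX / sqrt(MSE))."""
    return 20.0 * math.log10(peak / math.sqrt(mse))


def psnr_power_form(mse: float, peak: float = 255.0) -> float:
    """10 * log10(MAX^2 / MSE), algebraically the same as the amplitude form."""
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_halved(mse: float, peak: float = 255.0) -> float:
    """10 * log10(MAX / sqrt(MSE)), which gives half the dB value."""
    return 10.0 * math.log10(peak / math.sqrt(mse))


mse = 0.09  # mse_avg from the log line quoted in this thread
print(psnr_amplitude_form(mse), psnr_power_form(mse), psnr_halved(mse))
```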
You can also double-check the results using the Netflix clip checkerboard_1920-1080_10_3_0_0.yuv, which is located in the VMAF test folder.
Any help or opinions?