Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] A port of generation scripts from the mpv-prescalers project #803

Closed
wants to merge 22 commits into from

Conversation

hauuau
Copy link
Contributor

@hauuau hauuau commented Jan 6, 2024

This is a port of scripts by @bjin which are used to generate shaders in the mpv-prescalers repository.

I'd like some comments about viability of this approach.

I tried to keep things as close to the upstream as reasonably possible to make it easier to backport potential upstream changes in the future.

Performance doesn't seem to be different from already included versions but I haven't conducted much performance testing. It seems it could be possible to get some performance gains from playing with the --compute-shader-block-size option though but they're probably hardware specific.

I have tested generated shaders and they are functional. These include the new versions of RAVU-Zoom and RAVU-Lite with Anti-ringing built in. RGB versions are working too.

RAVU-Zoom generated by these scripts doesn't seem to suffer from the bug #554.

But I have unfortunately encountered something very similar to #516 during development and I'm utterly confused by it.
If I do something like

#define HOOKED_size float2(GetInputSize())

then the bug manifests itself.
But if I do something like

static const float2 HOOKED_size = float2(GetInputSize());

then the bug goes away for some reason.
That seems like some kind of compiler bug to me but I haven't investigated further.

That bug might be somehow related to either of these lines in RAVU-Zoom:

vec2 pos = HOOKED_size * HOOKED_map(ivec2(gl_GlobalInvocationID));

That's the line affected by difference in definition of HOOKED_size.

Or it might be this one:

ivec2 rectr = ivec2(floor(HOOKED_size * HOOKED_map(group_end) - 0.5)) + %d;

As soon as I lower that 0.5 coefficient to something like 0.4 the bug goes away. Might it be a rounding bug? I wonder why the upstream doesn't seem to be affected by it.

It's my first experience dealing with both GLSL and HLSL so I bet there are some mistakes or pitfalls I have fallen into.

@Blinue
Copy link
Owner

Blinue commented Jan 6, 2024

Thank you for your excellent work! To be honest, reviewing this PR is beyond my ability, and I think it would be better to make it a separate project (i.e., a fork of mpv-prescalers), and then contribute the generated effect files to this repository. Please note that the ongoing #643 will change the shader syntax again.

#516 is a bug that existed in the original version of RAVU, caused by insufficient sampling area of the original texture. It's very complex to implement effects that support arbitrary scaling factors with compute shaders.

@Blinue
Copy link
Owner

Blinue commented Jan 6, 2024

Considering that this PR is very complex, the review will take a long time, but I am busy implementing #643. If you insist on contributing to this repository, please be patient, and I will come back after 1.0 is released, including adapting to the new syntax and fixing bugs.

@hauuau
Copy link
Contributor Author

hauuau commented Jan 9, 2024

It may indeed be much easier to review and maintain changes to generation scripts in a separate repository as a fork of mpv-prescalers.
I have created it here and made necessary commits to bring it to the same state as this pull request.
Generation scripts for these shaders are in the source-magpie branch. This is a comparison between upstream and that branch: diff

Generated shaders for Magpie are in the magpie branch. This is a comparison between generated mpv and magpie versions: diff

I'll mark this pull request as a draft as this feels more appropriate in this case.

Do you have any preferences in how I should format pull requests for generated effects (single commit, separate commits, separate pull requests)?

I'll take a look at #643 and will make necessary changes. I'm fine with deferring this to after that gets merged.

If you have any requests for other changes I should make to generation scripts to make generated effects more fit to inclusion, I'll happily consider implementing them.

I'll investigate the bug in the original RAVU-Zoom further and will try contacting upstream about it.

@hauuau hauuau marked this pull request as draft January 9, 2024 01:58
@Blinue
Copy link
Owner

Blinue commented Jan 9, 2024

Do you have any preferences in how I should format pull requests for generated effects (single commit, separate commits, separate pull requests)?

Please open a new pull request and copy the generated effect files to the RAVU and NNEDI3 folders.

I noticed that the generated effect files are all under LGPL. Is it possible to re-license them as GPL, just like the other effects in Magpie?

hauuau added a commit to hauuau/magpie-prescalers that referenced this pull request Jan 10, 2024
@hauuau
Copy link
Contributor Author

hauuau commented Jan 10, 2024

I have switched license of generated shaders to GPL3+ as relicensing from LGPL3+ to GPL3+ is allowed according to the GPL Compatibility Matrix.

I'll submit separate pull requests for both v3 and v4 generated shaders.

@Blinue
Copy link
Owner

Blinue commented Jan 10, 2024

Awesome! Let's switch over to the new PRs, I think this one can be closed.

@hauuau
Copy link
Contributor Author

hauuau commented Jan 10, 2024

Sure.

@bjin
Copy link

bjin commented Jan 10, 2024

As soon as I lower that 0.5 coefficient to something like 0.4 the bug goes away. Might it be a rounding bug? I wonder why the upstream doesn't seem to be affected by it.

Could you verify that setting the coefficient to 0.49999 fixes the issue? If so, it can be confirmed as a float rounding issue.

AFAICT, it's very difficult to trigger such floating rounding issue. First, it has to be perfect odd integer scaling as a starter, such as 1x or 3x upscaling, to have texel centers somehow matches before and after ravu-zoom. And considering group size is set to 32x8, 3x or 5x upscaling won't hit the issue as well. So, overall, this bug likely will only be triggered if the scaling factor is exactly 1. And in mpv, I have a condition check to bypass ravu-zoom shader in such case //!WHEN HOOKED.w OUTPUT.w < HOOKED.h OUTPUT.h < *.

Second, the shader compiler has to do some kind of optimization so that the following two statements was handled differently for the same ivec2 ID. In mpv, all coordinates floats are configured to use high precision IIRC.

ivec2 rectr = ivec2(floor(HOOKED_size * HOOKED_map(ID) - 0.5)) + ...;

vec2 pos = HOOKED_size * HOOKED_map(ID);
ivec2 ipos = ivec2(floor(pos)) - rectl

This is just a quick look so I might made some mistakes.

@hauuau Could you confirm that:

  1. Setting the number to 0.49999 fixes the issue?
  2. Magpie applies ravu-zoom even on 1x scaling factor?
  3. This issue can only be triggered with 1x scaling factor?
  4. I don't know HLSL well, but could you try turning some kind of optimization of with pragma like #pragma optionNV(fastmath off) or #pragma optionNV(fastprecision off) for Nvidia cards?

@hauuau
Copy link
Contributor Author

hauuau commented Jan 10, 2024

  1. Setting the number to 0.49999 fixes the issue?

Setting coefficient in the rectr calculation to 0.49999 or 0.499999 seems to fix the issue. It comes back at 0.4999999.

  1. Magpie applies ravu-zoom even on 1x scaling factor?
  2. This issue can only be triggered with 1x scaling factor?

Magpie allows applying ravu-zoom and other arbitrary scaling shaders on 1x scaling but 1x doesn't seem to trigger this issue for me. For me it's usually triggered with 3x scaling (1280x720 → 3840x2160). The issue manifests itself either as random noise on a grid of block size or at least with 32x8 block size it's more often a horizontal 1px line of random noise 8px from the top of the screen.

  1. try turning some kind of optimization of with pragma like #pragma optionNV(fastmath off) or #pragma optionNV(fastprecision off) for Nvidia cards?

I'm not on NVidia hardware so optionNV pragmas do nothing for me and I have failed to find similar ones for AMD. I'm not sure if they work on HLSL or not. I'm very new to this stuff. And they are not mentioned in Microsoft's HLSL documentation.

HLSL has precise keyword to turn off optimizations during calculations of designated variables but it seems to have quite weird effect on this bug. If I declare rectr as precise then the bug goes away but if I try declaring both rectr and rectl as precise then it comes back. Maybe I need to try poking more variables with this keyword but I haven't tried further.

@bjin
Copy link

bjin commented Jan 10, 2024

@hauuau I tried 3x upscaling with mpv ravu-zoom just now, and somehow I'm still unable to reproduce. Guess it highly depends on shader compiler then.

Anyway, thanks for your tests. Able to verify that it's a float rounding issue is great.

@hauuau
Copy link
Contributor Author

hauuau commented Jan 10, 2024

@bjin
Even on HLSL with Magpie it seems to depend a lot on how different variables in those calculations are declared (even without precise keyword). Sometimes it seems to go away if temporary variable is inserted somewhere in this calculation.
I failed to reproduce it on mpv too.

Should lowering coefficient in the rectr calculation be considered a proper fix in this case?

@bjin
Copy link

bjin commented Jan 10, 2024

Should lowering coefficient in the rectr calculation be considered a proper fix in this case?

Yes, rectr is only used to pre-fetch sampling area, so it's safe to be a bit larger, as long as it's smaller than the size of shared samples[]. And samples[] are already declared slightly larger even for 1x scaling ((H+6)*(W+6) instead of (H+5)*(W+5)).

@hauuau
Copy link
Contributor Author

hauuau commented Jan 10, 2024

@bjin
Thank you very much! This helped me a lot.

bjin added a commit to bjin/mpv-prescalers that referenced this pull request Jan 10, 2024
While I couldn't reproduce this issue using mpv, it has been observed in
the HLSL port of RAVU-Zoom for Magpie. Instead of digging and trying to
figure out how different shader compilers handle floats differently, it's
always easier just to stay on the safe side.

Thanks to @hauuau for report and testing work.

ref: Blinue/Magpie#803
bjin added a commit to bjin/mpv-prescalers that referenced this pull request Jan 10, 2024
While I couldn't reproduce this issue using mpv, it has been observed in
the HLSL port of RAVU-Zoom for Magpie. Instead of digging and trying to
figure out how different shader compilers handle floats differently, it's
always easier just to stay on the safe side.

Thanks to @hauuau for report and testing work.

ref: Blinue/Magpie#803
@Blinue
Copy link
Owner

Blinue commented Jan 10, 2024

I can reliably reproduce this error at odd integer scaling (1x, 3x, 5x, etc.), where the calculation result of floor(HOOKED_size * HOOKED_map(group_end) - 0.5) becomes unstable due to floating-point error. Changing 0.5 to 0.49 might be a better solution, as it overcomes the error effect and has almost no performance loss.

@kasper93
Copy link

kasper93 commented Mar 4, 2024

I was debugging some other issue, but found interesting insight how (gather) sampling location is computed. [1] In short, sampling location is converted to at least 16.8 fixed point and while in this case it is gather specific, the same apply to linear scaling [2]. There is also in [1] interlude about nearest filtering.

And sure with linear filtering it shouldn't cause a problem, but I found it interesting detail which can produce strange results, in specific cases. So I felt like sharing it also here, if anything as a curiosity. Maybe for GPU experts this is obvious knowledge, but for me it was interesting :)

[1] https://www.reedbeta.com/blog/texture-gathers-and-coordinate-precision/
[2] https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.8%20Linear%20Sample%20Addressing

Pixels and polygons and shaders, oh my!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants