Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RenderPassSsao: improve SSAO blur performance #6684

Closed
wants to merge 12 commits into from

Conversation

querielo
Copy link
Contributor

@querielo querielo commented Jun 10, 2024

This PR suggests splitting one large blur pass from "RenderPass based SSAO" into four smaller blur passes without compromising quality.

Main PR
Screenshot 2024-06-10 at 22 55 37 Screenshot 2024-06-10 at 22 54 22

Tested on MacBook 14" Pro (2021), used device.maxPixelRatio = window.devicePixelRatio; to increase resolution of framebuffers.

I confirm I have read the contributing guidelines and signed the Contributor License Agreement.

querielo added 2 commits June 10, 2024 22:41
… big pass (2 gaussian for high frequent signal, 2 interleaved for low frequency)
@querielo querielo changed the title SSAO: improve ssao blur performance RenderPassSsao: improve SSAO blur performance Jun 10, 2024
@willeastcott willeastcott requested a review from mvaligursky June 11, 2024 12:30
@willeastcott willeastcott added performance Relating to load times or frame rate area: graphics Graphics related issue labels Jun 11, 2024
@mvaligursky
Copy link
Contributor

Hi @querielo - that's definitely a good way to speed it up. But I wonder why doing 4 passes instead of just typical two separable passes?

@querielo
Copy link
Contributor Author

querielo commented Jun 11, 2024

Hi. @mvaligursky

The main idea is that the first two passes weaken the high-frequency signal. The next two passes are used to eliminate low-frequency signal (there are strided/interleaved blurs with a large step, applied diagonally).

Experimentally, it seems that increasing the filter kernel size is necessary to remove low-frequency SSAO patterns. For example, using a 17x17 kernel on my computer results in a noticeable slowdown, but it appears to achieve the effect of four passes.

The suggested four-pass approach is just one way to implement blurring. An advanced developer could add RenderPassDepthAwareBlur to afterPasses themselves, depending on the specific SSAO pattern generated.

4 passes 2 passes, 11x11 2 passes, 17x17
Screenshot 2024-06-11 at 21 12 36 Screenshot 2024-06-11 at 21 39 57 Screenshot 2024-06-11 at 21 11 34

By the low-frequency signal, I mean the pattern you can see in the next image (kernelSize=11)

Screenshot 2024-06-11 at 21 47 14

@mvaligursky
Copy link
Contributor

I had a bit of a play with your branch. My findings:

  • two pass BOX blur with 9 samples seems to be equivalent to the current 9x9 filter, as expected. I assume this is faster.
  • two pass GAUSSIAN blur with 9 samples is a lot lower quality compared to that - see especially around the yellow torches
  • four pass you had set up seems to match in quality with the two pass BOX with 9 samples - so I'm not sure we need to use 4 passes.

I think the main reason you need 4 passes is the use of the GAUSSIAN weights instead of BOX.

@mvaligursky
Copy link
Contributor

And so my recommendation is: switch it to 2 pass BOX filtering. We can expose the number of taps to the user as that controls the quality vs the cost.

@querielo
Copy link
Contributor Author

@mvaligursky I switched the blur to 2 pass BOX filtering.

@querielo
Copy link
Contributor Author

Offtop: It looks like 6f29e1b breaks the background.
Screenshot 2024-06-12 at 13 25 18

@@ -308,6 +351,8 @@ class RenderPassSsao extends RenderPassShaderQuad {
}

createRenderTarget(name) {
// TODO: consider using a pool of 2 texture buffers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

considering we're down to 2 blurs, remove the comment

* @param {Vec2} [options.direction] - The direction of the blur. Defaults to (1, 0).
* @param {string} [options.channels] - The color channels to apply the blur to ('r'|'g'|'b'|'a'|'rg'|..|'ba'|'rgb'|'gba'|'rgba'). Defaults to 'rgba'.
*/
init(renderTarget = null, options = {}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not override the init function, but instead add a setup function which does all this.

totalWeight += ${weightCoefs[middle].toFixed(4)};`;

// TODO: move calculating UV coordinates to the vertex shader and pass them as varying
for (let i = 1; i <= kernelWidth; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're trying to move away from generating shader from javascript as much as possible, as it's a lot harder to understand and modify. We try to create a single shader string as much as possible, and then in code generate a list of defines to pre-pend. See RenderPassCompose as an example. It'd be great to modify this in a similar fashion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even though the shader here is trying to be pretty generic, which makes it harder (with the types and channels)

@mvaligursky
Copy link
Contributor

Offtop: It looks like 6f29e1b breaks the background.

It just got brighter as a result of this #6687 I suspect?

@mvaligursky
Copy link
Contributor

Looking much better, thanks!

I'd suggest to remove the option / API to change the blur type. Box looks better than Gaussian, and so the SSAO should just use Box. Only expose values that the user would benefit from adjusting.

uniform vec2 sourceInvResolution;
uniform int filterSize;
uniform vec2 direction;
uniform float kernel[KERNEL_SIZE];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hide the kernel behind #ifdef KERNEL to avoid the cost when the box filter is used

float diff = (sampleDepth - depth);
return max(0.0, 1.0 - diff * diff);
}
this.sourceInvResolutionId?.setValue(sourceInvResolutionValueTmp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for those ? there as those are transpiled to if .. we always set those up, so they're never undefined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VS Code TS checker highlighted this error, suggesting that sourceInvResolutionId could be undefined as it is not defined during creation. I'll check if Playcanvas linter highlights it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, linter highlights it, we have lots of these warnings in the engine, but they're not correct, but we don't see a way to remove them.

super.execute();
const defines = `#define KERNEL_SIZE ${this.kernelSize}\n`;

// CHECK: should we destroy the shader?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shaders are expensive to compile, so we don't destroy them, to make it faster when the shader is needed again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I will remove the comment.

// simple dithering helps a lot (assumes 8 bits target)
// this is most useful with high quality/large blurs
// ao += ((random(gl_FragCoord.xy) - 0.5) / 255.0);
this.updateShader();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally don't call updateShader directly, just set a _shaderDirty flag - this makes it easy to add more properties that modify the shader, and also better handle the case where the property is changed multi times per frame. (not that this is typical, but happens).

@querielo
Copy link
Contributor Author

querielo commented Jun 17, 2024

I'd suggest to remove the option / API to change the blur type. Box looks better than Gaussian, and so the SSAO should just use Box. Only expose values that the user would benefit from adjusting.

@mvaligursky Are you certain?
Stage: https://engine-m742xowmh-playcanvas.vercel.app/#/graphics/ambient-occlusion
To me, it seems that enlarging the kernel size of Box displaces dark areas from their correct position more quickly than hiding SSAO artifacts. However, Gaussian necessitates a larger kernel to conceal the SSAO pattern.

ssao_blur.mov

@mvaligursky
Copy link
Contributor

mvaligursky commented Jun 17, 2024

@mvaligursky Are you certain? Stage: https://engine-m742xowmh-playcanvas.vercel.app/#/graphics/ambient-occlusion To me, it seems that enlarging the kernel size of Box displaces dark areas from their correct position more quickly than hiding SSAO artifacts. However, Gaussian necessitates a larger kernel to conceal the SSAO pattern.

yeah interesting. It almost feel like the the bilateral weight function should be improved

float bilateralWeight(in float depth, in float sampleDepth) {
    float diff = (sampleDepth - depth);
    return max(0.0, 1.0 - diff * diff);
}

to detect the depth discontinuity better, as currently it blurs over those small edges. Maybe we can try something like this to have a control over it by the sigma value (untested, would need more research / testing)

float bilateralWeight(in float depth, in float sampleDepth, in float sigma) {
    float diff = (sampleDepth - depth);
    return exp(-diff * diff / (2.0 * sigma * sigma));
}

but agreed, the Gaussian blur in general should be slightly better, at a higher cost, so maybe leave it in.

@mvaligursky
Copy link
Contributor

Hi @querielo - have you had some time to look at these?

@mvaligursky
Copy link
Contributor

@querielo - please let me know if you have time to finished this PR. I'm keen to continue on some improvements too (#6658) but would prefer to avoid larger conflicts with the changes.

If you don't have time, I can take over this PR.

@MAG-AdrianMeredith
Copy link
Contributor

@querielo - please let me know if you have time to finished this PR. I'm keen to continue on some improvements too (#6658) but would prefer to avoid larger conflicts with the changes.

If you don't have time, I can take over this PR.

guess thats a no...

@mvaligursky
Copy link
Contributor

yep, it's on my list to get to in a week or so. Definitely not forgotten.

@querielo
Copy link
Contributor Author

querielo commented Jul 21, 2024

@mvaligursky Yes, you can take it. Sorry about that. I'm too busy right now and don't have time to fix it.

@mvaligursky
Copy link
Contributor

closing this due to #6870
Thanks @querielo

@mvaligursky mvaligursky closed this Aug 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: graphics Graphics related issue performance Relating to load times or frame rate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants