Meshlet single pass depth downsampling (SPD) #13003

Merged

alice-i-cecile merged 18 commits into bevyengine:main from JMS55:meshlet-spd

Jun 3, 2024

Contributor

JMS55 commented Apr 17, 2024

Objective

Using multiple raster passes to generate the depth pyramid is extremely slow
Pulling data from the source image is the largest bottleneck, so it's important to sample in a cache-aware pattern
Barriers and pipeline drains between the raster passes are the second largest bottleneck
Each separate RenderPass on the CPU is really expensive

Solution

Port FidelityFX SPD to WGSL, replacing meshlet's existing multiple raster passes with ~~a single~~ two compute dispatches. The lack of coherent buffers means we have to do the last 64x64 tile, from mip 7 onward, in a separate dispatch to ensure the mip 6 writes were flushed :(
Workgroup shared memory version only at the moment, as the subgroup operations version is blocked on our upgrade to wgpu 0.20 (Wgpu 0.20 #13186)
Don't enforce a power-of-2 depth pyramid texture size, simply scaling by 0.5 is fine
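
To make the mip arithmetic concrete, here is a minimal CPU-side sketch in Rust. It is not code from this PR. It assumes each mip is obtained by halving both dimensions with rounding down (never below 1), and that the first dispatch writes mips 1 through 6 while the second writes mip 7 onward, as described above. The function names `mip_chain` and `split_dispatches` are hypothetical.

```rust
/// Sizes of every mip below the source image, halving each dimension
/// (rounding down, never below 1) until a 1x1 mip is reached.
/// Assumption: this is the usual GPU mip convention, not taken from the PR.
fn mip_chain(width: u32, height: u32) -> Vec<(u32, u32)> {
    let (mut w, mut h) = (width, height);
    let mut mips = Vec::new();
    while w > 1 || h > 1 {
        w = (w / 2).max(1);
        h = (h / 2).max(1);
        mips.push((w, h));
    }
    mips
}

/// Splits `mip_count` downsampled mips into (first dispatch, second dispatch),
/// assuming the first dispatch writes at most mips 1 through 6.
fn split_dispatches(mip_count: usize) -> (usize, usize) {
    let first = mip_count.min(6);
    (first, mip_count - first)
}

fn main() {
    // A non-power-of-2 source size works without any padding.
    let mips = mip_chain(1920, 1080);
    let (first, second) = split_dispatches(mips.len());
    println!(
        "{} mips: {} in dispatch 1, {} in dispatch 2",
        mips.len(),
        first,
        second
    );
}
```

Under these assumptions, a source small enough to need six or fewer mips would leave the second dispatch with nothing to write.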

JMS55 added A-Rendering C-Performance labels

JMS55 added this to the 0.14 milestone

JMS55 requested a review from pcwalton

April 17, 2024 01:54

JMS55 mentioned this pull request

Meshlet tracking issue #11518

Open

JMS55 changed the title ~~Meshlet single pass depth generation (SPD)~~ Meshlet single pass depth downsampling (SPD)

Contributor

pcwalton commented Apr 17, 2024

Which parts do you want me to review? Presumably downsample_depth.wgsl? Anything else?

Contributor Author

JMS55 commented Apr 17, 2024

The downsample shader, yeah. I can't post a link ATM, but you can find the full PR diff by changing the GitHub diff to compare against my meshlet-previous-frame-depth-pyramid branch.

pcwalton reviewed

Contributor

pcwalton left a comment

Overall a straightforward port of FidelityFX, but I'm concerned about the lack of a texture barrier, and some more comments explaining the approach would be nice. Seems fine other than that.

crates/bevy_pbr/src/meshlet/downsample_depth.wgsl

+ tex = vec2(workgroup_id * 64u) + vec2(x * 2u + 32u, y * 2u + 32u);
+ pix = vec2(workgroup_id * 32u) + vec2(x + 16u, y + 16u);
+ v[3] = reduce_load_mip_0(tex);
+ textureStore(mip_1, pix, vec4(v[3]));

Contributor

pcwalton Apr 23, 2024

I'm not a big fan of the way there's so much repetition here, but I see that you're copying from FidelityFX, so I'm OK with it.

crates/bevy_pbr/src/meshlet/downsample_depth.wgsl

+ for (var i = 0u; i < 4u; i++) {
+ intermediate_memory[x][y] = v[i];
+ workgroupBarrier();
+ if local_invocation_index < 64u {

Contributor

pcwalton Apr 23, 2024

Ok, I had to stare at this a while. I assume what you're doing here is to work around the lack of subgroupQuadSwapX intrinsics. That's why you do local_invocation_index < 64 instead of local_invocation_index % 16 == 0 like FidelityFX does.

Can you add a comment saying that this is a workaround for lack of subgroup quad swap operations?

Contributor

pcwalton May 4, 2024

Oh wait, I guess this is similar to SpdDownsampleMips_0_1_LDS. OK, I see. Maybe add a TODO here so we can remember to update it when we have subgroup ops.

crates/bevy_pbr/src/meshlet/downsample_depth.wgsl Outdated

@group(0) @binding(0) var input_depth: texture_2d<f32>;

@group(0) @binding(1) var samplr: sampler;

/// Generates a hierarchal depth buffer.

/// Based on FidelityFX SPD https://github.com/GPUOpen-LibrariesAndSDKs/FidelityFX-SDK/blob/d7531ae47d8b36a5d4025663e731a47a38be882f/sdk/include/FidelityFX/gpu/spd/ffx_spd.h#L528

Contributor

pcwalton Apr 23, 2024

I'd add a brief comment here explaining the overall approach. Perhaps:

Every thread we dispatch is responsible for four 2x2 quads in the original depth buffer. Each workgroup has a shared 16x16 texel tile to work with. Different approaches are used at each mip level:
Mip level 0 (64x64): Use texture gather instructions to sample at the center of each 2x2 quad. Do this 4 times for the 4 quads we're responsible for. Save each downsampled depth value, effectively slicing each 32x32 piece into four 16x16 pieces.
Mip level 1 (32x32): For each of the four 16x16 pieces we have, copy it into the tile memory, then downsample it and barrier. This results in a 16x16 tile, which we store in our tile memory.
Mip levels 2-5 (16x16 to 2x2): The entire tile fits in memory, so we just downsample it in place.
Mip levels 6+: Load from the global texture and downsample 4 pixels at a time. Only one workgroup remains at this point.
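
The per-tile reduction described in this comment can be illustrated on the CPU. The Rust sketch below is not the shader and does not model threads, shared memory, or barriers. It only shows how a 64x64 tile collapses through six successive 2x2 reductions to a single texel, which is why one workgroup can cover six mips. It assumes the reduction keeps the minimum of each quad; the operator the PR actually uses is not shown in this excerpt, so swap in `max` if the depth convention requires it. The function names are hypothetical.

```rust
/// Reduces a square `size` x `size` tile (row-major) to `size/2` x `size/2`
/// by taking the minimum of each 2x2 quad.
/// Assumption: `min` is the reduction operator; the real shader may differ.
fn downsample_2x2(src: &[f32], size: usize) -> Vec<f32> {
    let half = size / 2;
    let mut dst = Vec::with_capacity(half * half);
    for y in 0..half {
        for x in 0..half {
            let a = src[(2 * y) * size + 2 * x];
            let b = src[(2 * y) * size + 2 * x + 1];
            let c = src[(2 * y + 1) * size + 2 * x];
            let d = src[(2 * y + 1) * size + 2 * x + 1];
            dst.push(a.min(b).min(c.min(d)));
        }
    }
    dst
}

/// Builds every mip of a 64x64 tile down to 1x1 (sizes 32, 16, 8, 4, 2, 1).
fn reduce_tile(tile: &[f32]) -> Vec<Vec<f32>> {
    assert_eq!(tile.len(), 64 * 64);
    let mut mips = Vec::new();
    let mut current = tile.to_vec();
    let mut size = 64;
    while size > 1 {
        current = downsample_2x2(&current, size);
        size /= 2;
        mips.push(current.clone());
    }
    mips
}

fn main() {
    // Synthetic depth values: texel i holds the value i.
    let tile: Vec<f32> = (0..64 * 64).map(|i| i as f32).collect();
    let mips = reduce_tile(&tile);
    println!("{} mips, final texel {}", mips.len(), mips.last().unwrap()[0]);
}
```

On the GPU each of these levels has to be separated by a barrier, since a level reads texels that other threads wrote in the previous one.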

JMS55 marked this pull request as draft

April 24, 2024 01:11

Contributor Author

JMS55 commented Apr 24, 2024

Downscale is broken, will need to debug. Might be the barrier issue.

JMS55 mentioned this pull request

More mips per pass JolifantoBambla/webgpu-spd#1

Open

JMS55 added 12 commits

April 27, 2024 23:23


Rebased commit from previous branch

50b19fa


Misc comment formatting

42a5a96


Clippy

307c694


Switch back to writing buffers for instance and meshlet IDs per clust…

fe786e9

…er, but this time from a shader


Fix clippy lint

ceedd60


Add TODO

0b0ef43


Merge commit 'abddbf2d95ab3b0c37acfb11b34a9f323772fad1' into meshlet-…

…instance-only-data-upload-squashed


Misc

3856fbd


Merge commit '64c1c65783938facc59d9b36cbaa6deba435d84e' into meshlet-…

4a48aeb

…instance-only-data-upload-squashed


Panic when MeshletPlugin not supported


Rebase and fix SPD (now TPD sadly)

55d394f


Fix typo

da9e4e3

JMS55 force-pushed the meshlet-spd branch from 5477d9e to da9e4e3 Compare

May 3, 2024 22:57

JMS55 added 3 commits

May 4, 2024 14:26


Merge commit '77ebabc4fe0224565a2039bd9c0195901560b67f' into meshlet-spd

5e5975a


Misc

65379df


Revert "Misc"

e5fe862

This reverts commit 65379df.

JMS55 marked this pull request as ready for review

May 4, 2024 21:29

JMS55 requested a review from pcwalton

May 4, 2024 21:31

alice-i-cecile added S-Needs-Review S-Waiting-on-Author labels

pcwalton self-requested a review

May 16, 2024 22:30

pcwalton approved these changes

Contributor

pcwalton left a comment

Looks good now. Looking forward to this!

pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

7952e78

[*Percentage-closer soft shadows*] are a technique from 2004 that allow
shadows to become blurrier farther from the objects that cast them. It
works by introducing a *blocker search* step that runs before the normal
shadow map sampling. The blocker search step detects the difference
between the depth of the fragment being rasterized and the depth of the
nearby samples in the depth buffer. Larger depth differences result in a
larger penumbra and therefore a blurrier shadow.

To enable PCSS, fill in the `soft_shadow_size` value in
`DirectionalLight` or `PointLight`. This shadow size value represents
the size of the light and should be tuned as appropriate for your scene.
Higher values result in a wider penumbra (i.e. blurrier shadows).

When using PCSS, temporal shadow maps
(`ShadowFilteringMethod::Temporal`) are recommended. If you don't use
`ShadowFilteringMethod::Temporal` and instead use
`ShadowFilteringMethod::Gaussian`, Bevy will use the same technique as
`Temporal`, but the result won't vary over time. This produces a rather
noisy result. Doing better would likely require downsampling the shadow
map, which would be complex and slower (and would require PR bevyengine#13003 to
land first).

A new example, `pcss`, has been added. It demonstrates the
percentage-closer soft shadow technique with directional lights, point
lights, non-temporal operation, and temporal operation. The assets are
my original work.

Fixes bevyengine#3631.

[*Percentage-closer soft shadows*]:
https://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf

pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

72f020b


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

00ef49d


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

27d5da7


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

bebf696

[*Percentage-closer soft shadows*] are a technique from 2004 that allow
shadows to become blurrier farther from the objects that cast them. It
works by introducing a *blocker search* step that runs before the normal
shadow map sampling. The blocker search step detects the difference
between the depth of the fragment being rasterized and the depth of the
nearby samples in the depth buffer. Larger depth differences result in a
larger penumbra and therefore a blurrier shadow.

To enable PCSS, fill in the `soft_shadow_size` value in
`DirectionalLight` or `PointLight`. This shadow size value represents
the size of the light and should be tuned as appropriate for your scene.
Higher values result in a wider penumbra (i.e. blurrier shadows).

When using PCSS, temporal shadow maps
(`ShadowFilteringMethod::Temporal`) are recommended. If you don't use
`ShadowFilteringMethod::Temporal` and instead use
`ShadowFilteringMethod::Gaussian`, Bevy will use the same technique as
`Temporal`, but the result won't vary over time. This produces a rather
noisy result. Doing better would likely require downsampling the shadow
map, which would be complex and slower (and would require PR bevyengine#13003 to
land first).

A new example, `pcss`, has been added. It demonstrates the
percentage-closer soft shadow technique with directional lights, point
lights, spot lights, non-temporal operation, and temporal operation. The
assets are my original work.

Fixes bevyengine#3631.

[*Percentage-closer soft shadows*]:
https://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf

pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

5cd3323

[*Percentage-closer soft shadows*] are a technique from 2004 that allow
shadows to become blurrier farther from the objects that cast them. It
works by introducing a *blocker search* step that runs before the normal
shadow map sampling. The blocker search step detects the difference
between the depth of the fragment being rasterized and the depth of the
nearby samples in the depth buffer. Larger depth differences result in a
larger penumbra and therefore a blurrier shadow.

To enable PCSS, fill in the `soft_shadow_size` value in
`DirectionalLight` or `PointLight`. This shadow size value represents
the size of the light and should be tuned as appropriate for your scene.
Higher values result in a wider penumbra (i.e. blurrier shadows).

When using PCSS, temporal shadow maps
(`ShadowFilteringMethod::Temporal`) are recommended. If you don't use
`ShadowFilteringMethod::Temporal` and instead use
`ShadowFilteringMethod::Gaussian`, Bevy will use the same technique as
`Temporal`, but the result won't vary over time. This produces a rather
noisy result. Doing better would likely require downsampling the shadow
map, which would be complex and slower (and would require PR bevyengine#13003 to
land first).

In addition to PCSS, this commit makes the near Z plane for the shadow
map configurable on a per-light basis. Previously, it had been hardcoded
to 0.1. This was necessary to make the point light shadow map in the
example look reasonable, as otherwise the shadows appeared far too
aliased.

A new example, `pcss`, has been added. It demonstrates the
percentage-closer soft shadow technique with directional lights, point
lights, spot lights, non-temporal operation, and temporal operation. The
assets are my original work.

Fixes bevyengine#3631.

[*Percentage-closer soft shadows*]:
https://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf

pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

8f3feed

[*Percentage-closer soft shadows*] are a technique from 2004 that allow
shadows to become blurrier farther from the objects that cast them. It
works by introducing a *blocker search* step that runs before the normal
shadow map sampling. The blocker search step detects the difference
between the depth of the fragment being rasterized and the depth of the
nearby samples in the depth buffer. Larger depth differences result in a
larger penumbra and therefore a blurrier shadow.

To enable PCSS, fill in the `soft_shadow_size` value in
`DirectionalLight`, `PointLight`, or `SpotLight`, as appropriate. This
shadow size value represents the size of the light and should be tuned
as appropriate for your scene.  Higher values result in a wider penumbra
(i.e. blurrier shadows).

When using PCSS, temporal shadow maps
(`ShadowFilteringMethod::Temporal`) are recommended. If you don't use
`ShadowFilteringMethod::Temporal` and instead use
`ShadowFilteringMethod::Gaussian`, Bevy will use the same technique as
`Temporal`, but the result won't vary over time. This produces a rather
noisy result. Doing better would likely require downsampling the shadow
map, which would be complex and slower (and would require PR bevyengine#13003 to
land first).

In addition to PCSS, this commit makes the near Z plane for the shadow
map configurable on a per-light basis. Previously, it had been hardcoded
to 0.1 meters. This change was necessary to make the point light shadow
map in the example look reasonable, as otherwise the shadows appeared
far too aliased.

A new example, `pcss`, has been added. It demonstrates the
percentage-closer soft shadow technique with directional lights, point
lights, spot lights, non-temporal operation, and temporal operation. The
assets are my original work.

Fixes bevyengine#3631.

[*Percentage-closer soft shadows*]:
https://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf

pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

6421eb0

[*Percentage-closer soft shadows*] are a technique from 2004 that allow
shadows to become blurrier farther from the objects that cast them. It
works by introducing a *blocker search* step that runs before the normal
shadow map sampling. The blocker search step detects the difference
between the depth of the fragment being rasterized and the depth of the
nearby samples in the depth buffer. Larger depth differences result in a
larger penumbra and therefore a blurrier shadow.

To enable PCSS, fill in the `soft_shadow_size` value in
`DirectionalLight`, `PointLight`, or `SpotLight`, as appropriate. This
shadow size value represents the size of the light and should be tuned
as appropriate for your scene.  Higher values result in a wider penumbra
(i.e. blurrier shadows).

When using PCSS, temporal shadow maps
(`ShadowFilteringMethod::Temporal`) are recommended. If you don't use
`ShadowFilteringMethod::Temporal` and instead use
`ShadowFilteringMethod::Gaussian`, Bevy will use the same technique as
`Temporal`, but the result won't vary over time. This produces a rather
noisy result. Doing better would likely require downsampling the shadow
map, which would be complex and slower (and would require PR bevyengine#13003 to
land first).

In addition to PCSS, this commit makes the near Z plane for the shadow
map configurable on a per-light basis. Previously, it had been hardcoded
to 0.1 meters. This change was necessary to make the point light shadow
map in the example look reasonable, as otherwise the shadows appeared
far too aliased.

A new example, `pcss`, has been added. It demonstrates the
percentage-closer soft shadow technique with directional lights, point
lights, spot lights, non-temporal operation, and temporal operation. The
assets are my original work.

Both temporal and non-temporal shadows are rather noisy in the example,
and, as mentioned before, this is unavoidable without downsampling the
depth buffer, which we can't do yet. Note also that the shadows don't
look particularly great for point lights; the example simply isn't an
ideal scene for them. Nevertheless, I felt that the benefits of the
ability to do a side-by-side comparison of directional and point lights
outweighed the unsightliness of the point light shadows in that example,
so I kept the point light feature in.

Fixes bevyengine#3631.

[*Percentage-closer soft shadows*]:
https://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf

pcwalton mentioned this pull request

Implement percentage-closer soft shadows (PCSS). #13497

Merged

pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

9aaa4b5


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

03a5479


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).


pcwalton added a commit to pcwalton/bevy that referenced this pull request


Implement percentage-closer soft shadows (PCSS).

1d385d9


alice-i-cecile removed the S-Waiting-on-Author label

atlv24 approved these changes

Contributor

atlv24 left a comment

looks good. i've already looked through this code a few times before, just never left an approve

IceSentry added S-Ready-For-Final-Review and removed S-Needs-Review labels

IceSentry reviewed

crates/bevy_pbr/src/meshlet/gpu_scene.rs

Comment on lines +798 to +813

+ texture_depth_2d(),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),
+ texture_storage_2d(
+ TextureFormat::R32Float,
+ StorageTextureAccess::ReadWrite,
+ ),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),
+ write_only_r32float(),

Contributor

IceSentry Jun 2, 2024

I guess this could help a bit

Suggested change

- texture_depth_2d(),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
- texture_storage_2d(
-     TextureFormat::R32Float,
-     StorageTextureAccess::ReadWrite,
- ),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
- write_only_r32float(),
+ // view depth
+ texture_depth_2d(),
+ // mip 1
+ write_only_r32float(),
+ // mip 2
+ write_only_r32float(),
+ // mip 3
+ write_only_r32float(),
+ // mip 4
+ write_only_r32float(),
+ // mip 5
+ write_only_r32float(),
+ // mip 6
+ texture_storage_2d(
+     TextureFormat::R32Float,
+     StorageTextureAccess::ReadWrite,
+ ),
+ // mip 7
+ write_only_r32float(),
+ // mip 8
+ write_only_r32float(),
+ // mip 9
+ write_only_r32float(),
+ // mip 10
+ write_only_r32float(),
+ // mip 11
+ write_only_r32float(),
+ // mip 12
+ write_only_r32float(),

IceSentry approved these changes

Contributor

IceSentry left a comment

I didn't go over the shader code, but I trust the other reviewers on that.

alice-i-cecile added this pull request to the merge queue

Merged via the queue into bevyengine:main with commit 5536079

32 checks passed

otoomey commented Jun 26, 2024

I believe that there might be an improvement that can be made here:

fn reduce_load_mip_0(tex: vec2u) -> f32 {
    let uv = (vec2f(tex) + 0.5) / vec2f(textureDimensions(mip_0));
    return reduce_4(textureGather(mip_0, samplr, uv));
}

From what I understand of the spec, textureGather already computes the four-component minimum and stores it in the w channel, so there is no need to reduce. Just do the following:

fn reduce_load_mip_0(tex: vec2u) -> f32 {
    let uv = (vec2f(tex) + 0.5) / vec2f(textureDimensions(mip_0));
    return textureGather(mip_0, samplr, uv).w;
}

It is very possible that the compiler spots this optimization already.

Frankly, I'm not sure exactly how textureGather works. In Vulkan you need a VK_SAMPLER_REDUCTION_MODE_MIN sampler and the corresponding extension to do this.

Contributor Author

JMS55 commented Jun 27, 2024

textureGather does not return the minimum of the 4 values. The w component is the (u_min, v_min) value of the sample footprint. I.e. given 4 texels (the sample footprint) arranged in a 2x2 quad, the w component is the value at location (u_min, v_min).

Yes, ideally I would be able to use VK_SAMPLER_REDUCTION_MODE_MIN, but unfortunately wgpu does not support it.
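
To make the component ordering concrete, here is a small CPU-side sketch in Rust. It models a 2x2 footprint and returns the four texels in the order the WGSL spec lists for textureGather: x = (u_min, v_max), y = (u_max, v_max), z = (u_max, v_min), w = (u_min, v_min). The `Footprint` struct and `gather` function are hypothetical stand-ins for illustration, and `reduce_4` is assumed here to be a four-way minimum; none of this is the shader's actual code.

```rust
/// Hypothetical model of the 2x2 texel footprint covered by one gather.
struct Footprint {
    u_min_v_min: f32,
    u_max_v_min: f32,
    u_min_v_max: f32,
    u_max_v_max: f32,
}

/// Returns the four texels in textureGather component order [x, y, z, w].
/// Note that no reduction happens here: w is just the (u_min, v_min) texel.
fn gather(fp: &Footprint) -> [f32; 4] {
    [fp.u_min_v_max, fp.u_max_v_max, fp.u_max_v_min, fp.u_min_v_min]
}

/// Assumed reduction: the minimum of the four gathered values.
fn reduce_4(v: [f32; 4]) -> f32 {
    v[0].min(v[1]).min(v[2].min(v[3]))
}
```

With a footprint whose (u_min, v_min) texel happens to be the largest value, taking `.w` alone would return that largest value, while `reduce_4` returns the true minimum. That is why the explicit reduction is still needed without a min-reduction sampler.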

github-merge-queue bot pushed a commit that referenced this pull request


Implement percentage-closer soft shadows (PCSS). (#13497)

2ae5a21

## Changelog

### Added

* Percentage-closer soft shadows (PCSS) are now supported, allowing
shadows to become blurrier as they stretch away from objects. To use
them, set the `soft_shadow_size` field in `DirectionalLight`,
`PointLight`, or `SpotLight`, as applicable.

* The near Z value for shadow maps is now customizable via the
`shadow_map_near_z` field in `DirectionalLight`, `PointLight`, and
`SpotLight`.

## Screenshots

PCSS off:
![Screenshot 2024-05-24 120012](https://github.com/bevyengine/bevy/assets/157897/0d35fe98-245b-44fb-8a43-8d0272a73b86)

PCSS on:
![Screenshot 2024-05-24 115959](https://github.com/bevyengine/bevy/assets/157897/83397ef8-1317-49dd-bfb3-f8286d7610cd)

---------

Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Torstein Grindvik <52322338+torsteingrindvik@users.noreply.github.com>

Labels

A-Rendering C-Performance D-Modest S-Ready-For-Final-Review X-Uncontroversial