Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split rendering of shadowmaps into a static and dynamic pass #77683

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

BastiaanOlij
Copy link
Contributor

@BastiaanOlij BastiaanOlij commented May 31, 2023

This PR changes the shadow map rendering logic so static objects (objects marked as not moving) are rendered to a static shadow map. This static shadow map is only updated if the light changes or if the light is assigned a new region in the shadow map.

This PR is now rebased on top of #77085 as I'm building on some of the debug changes in that PR

This functionality can be enabled through a new project settings:

image

Each frame the static shadow map is copied and we render dynamic objects (objects marked as potentially moving) into this shadow map. This is this shadow map that is used to render shadows.

This removes a lot of overhead in rendering shadow maps when there are many static objects.

Geometry can be marked as static or dynamic to determine whether they need to be rendered to shadow maps every frame (if applicable). There is an automatic mode that, with a little overhead, figures out which objects move after the static shadow map is rendered and marks those objects as dynamic automatically. After a few frames dynamic objects are identified and the efficiency is applied.

image
The property name is currently named Render shadow.
See the discussion in the comments about possible changes with an approach for using an "inherit from parent" approach though the automatic mode seems to work well enough.

Note if animations are linked to a mesh, we automatically recognize it as dynamic.

Our debug display will overlap the static and dynamic results.

  • Static shadows are shown in black and white (black is close)
  • Dynamic shadows are shown in black and blue (blue is close)

image

Currently only the Vulkan renderer supports this feature.

Todos:

@mrjustaguy
Copy link
Contributor

This resolves godotengine/godot-proposals#4635, and which is one part of godotengine/godot-proposals#6948

@Saul2022
Copy link

And this issue partially i think. #73411

@lawnjelly
Copy link
Member

lawnjelly commented May 31, 2023

Just food for thought, if marking objects as dynamic / static, could it be an idea to future proof this and have a generic flag rather than labelling it specifically for shadows?

I know believe 😄 historically @reduz was against static / dynamic flags, but once you add them, they open up the door to a bunch more optimizations, and we probably don't want to end up in situation with multiple static / dynamic flags.

(NOTE: This is badly remembered second-hand info from @Calinou from a few years ago so could be "lost in translation" 😄 . This could have been referring to a particular case, and besides, Godot has evolved a lot over the years. If I remember right the idea was to keep Godot as easy to use as possible and with as few options as possible to keep things simple. This is understandable - as time goes on there are inevitably more and more settings - paralysis of choice is a real problem.)

(An alternative which I think @Calinou has suggested in the past is to automatically detect static objects, but that could be a bit cumbersome in some cases.)

It might also be an idea to have the static / dynamic flag as inherited / static / dynamic, with default inherited, that way users could easily mark a branch as static without marking every mesh.

@Calinou
Copy link
Member

Calinou commented May 31, 2023

Just food for thought, if marking objects as dynamic / static, could it be an idea to future proof this and have a generic flag rather than labelling it specifically for shadows?

I think this new flag makes sense to be called a generic rendering flag, but there's definitely some demand for keeping the GI and general rendeering flags separate (cc @EMBYRDEV). How do other engines handle this in comparison?

@BastiaanOlij
Copy link
Contributor Author

@lawnjelly indeed @clayjohn was discussing with me the merits of introducing a flag for broader use. I kept it straight forward for now just to get the basics working and tested but we can see where we want to take this.

I'm interested to know what @reduz suggestion was to not have the flag. We can infer what objects are dynamic to some extend, especially if they are children of physics objects, but there are a lot of situations where we won't know if an object will move until code moves it. The overhead of keeping track of which objects are rendered to a light and which have changed from static to dynamic would be intense not to mention we don't capture situations where an object is just moved one time or as part of a level change.

@JoNax97
Copy link
Contributor

JoNax97 commented May 31, 2023

How do other engines handle this in comparison?

Unity has a single per-object static checkbox, that actually sets a bunch of flags for different systems. You can also set it per-system.

GameObjectStaticDropDownMenu1

When changing these values, the user is asked whether to assign it to the current object only, or to its children as well.

Personally, I think it's a nice workflow. For basic uses, it's a simple "toggle this for free optimization" check, while also offering advanced options.

@BastiaanOlij
Copy link
Contributor Author

Right now I'm going in the opposite direction, everything is static by default and you need to mark dynamic objects, seeing 90% of your scene will be static, this way the amount of work is lessened.
The problem is that it also means things are "broken" until you mark the correct objects as dynamic.

I do believe the answer lies with good IDE functions, configuration warnings if your a child of a physics body and dynamic is not selected, indeed the ability to make a whole tree as dynamic/static.

Maybe this should work like our pause function, where the default mode is "inherited from parent" and we go down the tree. If we come all the way down to the root level it's static, if we encounter a physics body it's dynamic, if we encounter a parent that has static of dynamic set, we use that?

@bitsawer
Copy link
Member

bitsawer commented May 31, 2023

Maybe this should work like our pause function, where the default mode is "inherited from parent" and we go down the tree. If we come all the way down to the root level it's static, if we encounter a physics body it's dynamic, if we encounter a parent that has static of dynamic set, we use that?

I was just about to suggest this exact thing. With this, it would be pretty easy to create scene configurations that "just work". For example making basic Statics / Dynamics nodes near the root. You would then have to enable the dynamic mode only on a single node and the children would then automatically inherit that (unless they specify otherwise).

statics

@jcostello
Copy link
Contributor

jcostello commented May 31, 2023

What happen when using lightmaps and what about harirchy? For example in a complex scene you have to go to all meshes and set them static even if a parent node is set to static? Could this be done not in a mesh but in a 3DNode and children inherit from it?

@mrjustaguy
Copy link
Contributor

I see no reason why not

@Saul2022
Copy link

I do believe the answer lies with good IDE functions, configuration warnings if your a child of a physics body and dynamic is not selected, indeed the ability to make a whole tree as dynamic/static.

Not sure, i think it should be automatic, to avoid the user having to click one by one the dynamic mesh.
Also what would happen with 4.0 projects opened in 4.1, will they be static automatically?

@Calinou
Copy link
Member

Calinou commented May 31, 2023

Also what would happen with 4.0 projects opened in 4.1, will they be static automatically?

Yes, but this feature isn't slated to be merged for 4.1 due to feature freeze.

@JoNax97
Copy link
Contributor

JoNax97 commented May 31, 2023

everything is static by default and you need to mark dynamic objects

I agree with of your comment, except for this part.
Making 'static' the default will lead to confusion and unexpected behaviors, specially for newcomers.

If I place a 3d object on scene and move it by code, the behaviour should be correct by default. as I see it, performance is "optional", correctness is not.

@BastiaanOlij
Copy link
Contributor Author

If I place a 3d object on scene and move it by code, the behaviour should be correct by default. as I see it, performance is "optional", correctness is not.

The problem though is that you're introducing a hell of a lot of workload, the vast majority of objects in a scene will be static.
Not setting an object to dynamic that should be dynamic has an immediate visual effect that the user can correct.
Not setting an object to static that should be static goes completely unnoticed, it becomes like finding a needle in a haystack with no queue's to the user that there is performance to be gained.

I think the idea to inherit the behaviour has a lot of merit especially when some of this is automatic (i.e. a mesh instance as a child of a rigid body becomes dynamic automatically) but I was originally planning on only doing this for geometry nodes.
We could make it a property of Node3D I guess but that may lead to confusion when there is a property that doesn't do anything on most nodes other than pass information up the tree.

@jcostello
Copy link
Contributor

What happen when using lightmaps and what about harirchy? For example in a complex scene you have to go to all meshes and set them static even if a parent node is set to static? Could this be done not in a mesh but in a 3DNode and children inherit from it?

Any input on this?

@BastiaanOlij
Copy link
Contributor Author

What happen when using lightmaps and what about harirchy? For example in a complex scene you have to go to all meshes and set them static even if a parent node is set to static? Could this be done not in a mesh but in a 3DNode and children inherit from it?

Any input on this?

Right now nothing much to say because we're separating the settings for GI/lightmaps and normal lights.

To the best of my knowledge, when you mark a light as having to be baked, that light no longer gets evaluated as the light information is now coming from the lightmap or from VoxelGI or SDFGI. So that light in effect is only static at that point in time.
Though I might be overlooking something here, I'm not sure if dynamic objects are still handled through shadowmaps, I haven't dived into this much.

If that is the case, only objects marked as dynamic for GI will be handled by the normal shadowmap system which will then divide that remainder in what needs to be rendered into the static shadow map (hopefully nothing) and what goes into the dynamic shadow map. There is no additional overhead if nothing is rendered to the static shadowmap, the logic is able to skip over that nicely.

That is a bigger discussion however and comes down to whether it makes sense to have a separate setting for GI and for shadowmaps, or if we should have a single setting that just markes a node as static or dynamic.

@Lielay9
Copy link
Contributor

Lielay9 commented Jun 1, 2023

The problem though is that you're introducing a hell of a lot of workload, the vast majority of objects in a scene will be static. Not setting an object to dynamic that should be dynamic has an immediate visual effect that the user can correct. Not setting an object to static that should be static goes completely unnoticed, it becomes like finding a needle in a haystack with no queue's to the user that there is performance to be gained.

The default could always be different for things created inside Godot and imported scenes. I doubt the performance really matters in level blockouts and I'd rather have my crappy tween driven cube doors and other placeholders work without any extra effort. Would also mean you'd only have to worry about the value during import process which I think would be the natural time and place to set the value anyway (good spot to do any fancy auto-detection too).

@Calinou
Copy link
Member

Calinou commented Jun 1, 2023

Though I might be overlooking something here, I'm not sure if dynamic objects are still handled through shadowmaps, I haven't dived into this much.

When using LightmapGI, both objects with static and dynamic GI cast shadows (and therefore incur shadow draw calls). In a scenario with fully baked lights, this allows dynamic objects to receive accurate shadows from the world.

This however has more performance overhead compared to skipping light rendering entirely for static lightmapped lights. Eventually, we'll need to add a 4th "Fully Static" GI mode that skips light rendering entirely (except when baking the lightmap) and bakes the direct light in the dynamic object capture system (lightmap probes). This could also be supported for VoxelGI and SDFGI, but it's not as important there due to the light data's density generally being too low. The current way of rendering static lights in LightmapGI should remain available as an option though, since it allows for accurate shadow casting onto dynamic objects (which can be valuable in many cases but with better performance than baking indirect light only).

@BastiaanOlij BastiaanOlij force-pushed the split_static_dynamic_shadowmap branch 2 times, most recently from 9e87a3a to fd258a8 Compare June 5, 2023 04:20
@BastiaanOlij
Copy link
Contributor Author

Ok omni and spot lights now render their shadow maps properly with the dynamic and static split. Cubemap omni lights still have their two step approach but I think I found a good compromise here in the approach.
I've also updated displaying the shadow atlas to show what part of the map reflects dynamic objects by drawing those in blue and inverting their depth.

@BastiaanOlij BastiaanOlij force-pushed the split_static_dynamic_shadowmap branch 2 times, most recently from 0b98f39 to f78845a Compare June 5, 2023 05:29
@BastiaanOlij
Copy link
Contributor Author

Still trying to find an issue with the dynamic cubemap being copied in the wrong order..

@reduz
Copy link
Member

reduz commented Aug 10, 2023

As I mentioned before, I think to merge this PR, it has to be shown that it really can improve performance vs simply dropping dynamic objects in complex scenes. My experience doing these things is that the bandwidth burden of copying and bookkeeping a static shadow map can be greater or equal to the savings of simply not rendering those objects. This is because the lower end devices that are benefited with not drawing the objects (IGP in general) also have limited memory bandwidth to cope with copying such large shadow buffers.

On top of this, on mobile, this behavior is absolutely abysmal in performance because it means a framebuffer load at the beginning of the shadow rendering, which incurs in massively more bandwidth used than just rendering the objects for TBDR architectures.

Additionally to this, in general in modern desktop GPUs, while shadow maps are being rendered, compute and other activities can be done at the same time. This may break this parallelism somehow (which was working at some point, would need to be checked again now).

To sum up, In general:

  • To me, this is a high-end desktop only optimization, it won't solve the low-end desktop problem. Benchmarks would be fantastic to prove me wrong. Mobile is out of the picture.
  • I would personally opt for a simpler solution for low end and mobile which is, just dropping dynamic objects beyond a certain distance from the shadow map and simply not update it.
  • For higher end devices, remember we can also take advantage of parallelism, so shadow rendering is done at the same time as compute.

@Saul2022
Copy link

Saul2022 commented Aug 10, 2023

I don´t know if this can work as a performance test, but from my igpu point, seem´s like the dynamic and static flag don´t seem to affect performance, though my igpu has an amd radeon vega 3 driver,static meshes(not moving) don´t gain any performance when it on static flag.
shadow-perf-test.zip

Edit for context:

editor_screenshot_2023-08-10T184006
editor_screenshot_2023-08-10T184040

@mrjustaguy
Copy link
Contributor

sure as it isn't re-rendering the shadow maps, have some dynamic object that moves around the lights, if you got a heavy scene it'd kill performance with no static lights.

@Saul2022
Copy link

Saul2022 commented Aug 10, 2023

Ok got another scene this time with more cubes. and a character moving with dynamic shadows and a omni light on it and it the shadow2test one
shadow-perf-test (2).zip

the one with more fps is the static. shadows

IMG20230810194222
IMG20230810194152

@mrjustaguy
Copy link
Contributor

Good, so it benefits potato IGPUs.. Will likely translate even better to more complex scenes on higher end hardware, as long as it doesn't run into issues with parallelism, which afaik might be an issue with advanced features such as SDFGI which run in parallel with shadow rendering if I remember correctly...

@BastiaanOlij
Copy link
Contributor Author

Cool benchmark @Saul2022 , the one I was working on was far more elaborate but I might add the things I wanted to add to yours.

@reduz I agree with a lot of what you state here. For mobile the copy of the static map is very expensive and may outweigh any benefit. Indeed this seems mostly a technique that applies to high end.

The other thing that I started to realise more and more as I was working on a benchmark project is that most lights don't cover a lot of objects to begin with, you may only win performance in very complex environments where lights cover a lot of ground. We need a far more stable algorithm to ensure lights use the same quadrant of the atlas as long as possible. Right now they shift too easily even when its not required.

That all said, the logic to autodetect which objects are dynamic and which are static may remain useful. Especially if we go to an approach where we rendering static lights for our stepped directional lights. The problem I have at the moment is that once the list is split to static and dynamic objects, it becomes harder to combine them.

@Saul2022
Copy link

@reduz I agree with a lot of what you state here. For mobile the copy of the static map is very expensive and may outweigh any benefit. Indeed this seems mostly a technique that applies to high end.

Maybe what could be done is first merge this pr and then later on for mobile us igpu, make what juan said on not re render shadows at a distance, though that parameter could be a customized sething in the shadow properties or at the project sethings, this could get the best of both worlds, and help the stepped directional shadow pr.

@mrjustaguy
Copy link
Contributor

mobile means phone, not laptop... they use vastly different rendering architectures and philosophies, due to vastly different constraints

@Saul2022
Copy link

mobile means phone, not laptop... they use vastly different rendering architectures and philosophies, due to vastly different constraints

I know, what i mean is applying what juan said for the specs he mentioned(as he mentioned that method for low end and mobile) on a follow up pr, and merge this one for the stepped directional shadows pr.

@Calinou
Copy link
Member

Calinou commented Aug 21, 2023

OS: Fedora 38
CPU: Intel Core i9-13900K
GPU: GeForce RTX 4090
Resolution: 1152×648 (and 3840×2160 for fullscreen tests)

In @Saul2022's MRP, this PR unfortunately causes performance regressions on both a debug editor build and a release export template build. This is due to a CPU bottleneck as evidenced by MangoHud readouts, as the GPU sees much lower utilization:

5444afa This PR (rebased against 5444afa)
Screenshot_20230821_020615 Screenshot_20230821_020602

Other performance metrics:

Type 5444afa This PR (rebased against 5444afa)
1152×648 static 5380 FPS (0.18 mspf) 1683 FPS (0.59 mspf)
3840×2160 static 974 FPS (1.02 mspf) 848 FPS (1.17 mspf)
1152×648 moving capsule back and forth 5323 FPS (0.18 mspf) 1663 FPS (0.60 mspf)
3840×2160 moving capsule back and forth 956 FPS (1.04 mspf) 814 FPS (1.22 mspf)

This may not be as much of an issue in scenes with more complex geometry, but it's something to keep in mind despite the final framerate still being easily above 500 FPS.

PS: The capsule's shadow looks very slightly different with this PR too. This may hint at a sampling/filtering issue, though it's not critical.

protected:
public:
DebugEffects();
~DebugEffects();

void draw_shadow_frustum(RID p_light, const Projection &p_cam_projection, const Transform3D &p_cam_transform, RID p_dest_fb, const Rect2 p_rect);
void draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect);
void draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 &p_rect);

I'd pass by reference here

@BastiaanOlij
Copy link
Contributor Author

@Calinou thanks for testing, that kind of matches up with some of my own testing which is why I've let this sit for a bit and been focussing on other parts.

At the moment I am unsure of whether this approach simply doesn't provide the benefits we hoped for (as Reduz has indicated in the past), or whether there are faults in the implementation.

I do plan to return to this after some of my other workload is behind me and take a fresh look at it.

If all else fails, I may bring this back in feature set to match suggestions made by Reduz and purely focus on removing dynamic objects from shadow maps, we'll see.

I'm going to put this PR back on draft until we have time to sort this out.

@BastiaanOlij BastiaanOlij marked this pull request as draft September 15, 2023 01:44
@BastiaanOlij BastiaanOlij modified the milestones: 4.2, 4.x Sep 15, 2023
@Calinou
Copy link
Member

Calinou commented Sep 15, 2023

If all else fails, I may bring this back in feature set to match suggestions made by Reduz and purely focus on removing dynamic objects from shadow maps, we'll see.

I'd recommend opening a separate PR if you do this, so that this particular work isn't lost if we decide to bring it back in the future.

@mrjustaguy
Copy link
Contributor

Simply removing dynamic objects from shadow maps can easily be done by the user, by turning "Casts shadows" to off, and the only thing that would need to be done is stepped directional shadow updating at 1 cascade a frame, which would be a slight tweak of the other PR.

However I do have to point out, that the GPU used for Testing is Insanely powerful, great for triangle pushing and the scene is still very simple.

This should be tested in a scene where the GPU is struggling to keep up with rendering all the Static Shadows, not where it's sleeping while doing so, else you're guaranteed to not see any benefits from this and just the overhead of it.

@BastiaanOlij
Copy link
Contributor Author

@mrjustaguy very true, though on a fast GPU you would expect only a marginal cost or marginal improvement. In my testing I had scenarios where on a fast GPU split shadows were 30% slower.

The problem I had is that I couldn't identify why:

  • did I screw something up in the approach
  • was this slowdown due to something outside of our control
  • was there something wrong with how I measured performance, I did multiple runs and got totally different numbers each time to a point that I totally lost trust in how things were being measured.

I'm very positive that I can figure it out, but at the time I had been working nonstop on this for 2 months and frankly it started to burn me out. So I needed to take a break from it and come back when I can review it with a fresh perspective.

@Calinou indeed, when I get back to things I'll probably split of bits that I know are good and useful off into separate PRs, I do NOT want to loose the work I've done here. For all I know there is just a mistake in the approach and once we figure that out, all the work here can be fully salvaged.

@mrjustaguy
Copy link
Contributor

After some time Working on my Project I have found one potentially massive application for this, though for Directional Shadows only with Stepped Cascade updating.

Heavy Forest Areas absolutely murder DS Rendering performance especially at higher Shadowmap Resolutions, due to all the Alpha Foliage it's rendering..
In some of my test scenes the Shadow Map Rendering takes 20ms (at 8k SM) on it's own just due to all those semi-transparent quads, scaling nearly linearly to 78ms when upping the res to 16k SM (Note that tis is on an RX 6600)

An easy way to simulate this behavior for testing is to put on a noise texture on a massive quad, use alpha scissoring, and duping it a couple of dozen layers one on top of the other.

@@ -1253,6 +1258,15 @@ void RendererSceneCull::instance_geometry_set_cast_shadows_setting(RID p_instanc
_instance_queue_update(instance, false, true);
}

void RendererSceneCull::instance_geometry_set_shadow_mode(RID p_instance, RS::ShadowDynamicMode p_shadow_dynamic_mode) {
Instance *instance = instance_owner.get_or_null(p_instance);
ERR_FAIL_COND(!instance);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ERR_FAIL_COND(!instance);
ERR_FAIL_NULL(instance);

</constant>
<constant name="SHADOW_RENDER_STATIC_MAP" value="1" enum="ShadowRenderSetting">
This instance is rendered to static shadow maps. This speeds up shadow map rendering for objects that do not move.
Note: if the objects position or orientation changes, its shadow will not be updated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note: if the objects position or orientation changes, its shadow will not be updated.
[b]Note:[/b] If the objects position or orientation changes, its shadow will not be updated.

@@ -326,3 +338,27 @@ void DebugEffects::draw_shadow_frustum(RID p_light, const Projection &p_cam_proj
}
}
}

void DebugEffects::draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void DebugEffects::draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect) {
void DebugEffects::draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 &p_rect) {

@atirut-w
Copy link
Contributor

Geometry can be marked as static or dynamic to determine whether they need to be rendered to shadow maps every frame (if applicable). There is an automatic mode that, with a little overhead, figures out which objects move after the static shadow map is rendered and marks those objects as dynamic automatically. After a few frames dynamic objects are identified and the efficiency is applied.

This could cause problems when there are objects that only move with interacted with, so it should probably be discouraged. Maybe it should detect from the GI mode instead? It's pretty surefire in my opinion. If a mesh's GI mode is static, you know it doesn't move, so you can safely give it static shadows.

@Capewearer
Copy link

I know it's basically begging, but could this draft be accounted for 4.4 release? First 4.4 snapshot looked very cool with its rendering performance improvements, so final release might be much cooler in comparison with 4.3.
Like separate static shadow pass CAN be very useful even in case of games with dynamic environments (e.g. destructible ones, like buildings in Battlefield or blocks in Minecraft), because not all destructibles are necessary something movable. They are often static meshes which produce "gibs" after destruction. In such case you could benefit.
And many AAA games use that approach (e.g. Doom 2016).

@clayjohn
Copy link
Member

clayjohn commented Sep 5, 2024

I know it's basically begging, but could this draft be accounted for 4.4 release? First 4.4 snapshot looked very cool with its rendering performance improvements, so final release might be much cooler in comparison with 4.3. Like separate static shadow pass CAN be very useful even in case of games with dynamic environments (e.g. destructible ones, like buildings in Battlefield or blocks in Minecraft), because not all destructibles are necessary something movable. They are often static meshes which produce "gibs" after destruction. In such case you could benefit. And many AAA games use that approach (e.g. Doom 2016).

This PR needs some deep investigation to figure out exactly why it isn't performing as we expect. Because this feature comes with a lot of tradeoffs (it adds significant complexity to users in exchange for better performance), it needs to have really good performance to be worth merging.

We just aren't able to invest enough time into it right now to get it where we want it to be. It is certainly something we want to see finished eventually, but we have other optimizations that are higher priority right now

@Capewearer
Copy link

Capewearer commented Sep 10, 2024

@clayjohn even though you claim about other priorities (that's valid reason), I would want to take some words in defense of investigating this PR further.
At first, this option is toggleable, so anyone who gets performance regression may turn it off (just like Depth Prepass, its use-case is very specific, yet very common, as I noticed that even "dynamic" games have immovable objects that will benefit from it, just like Depth Prepass is considered useful for fillrate-dependent scenes, that are common, yet specific thing).
About CPU spikes - I really don't know is frame-time got 0.5 ms because it was THAT expensive or it's just single-threaded? Because if it is, difference could be much lower. Even two cores can reduce this to 0.25 ms. And even so, do games aim nowadays for framerate higher than 144? Even so the frame budget is ~6,944 ms, so 0,5 ms takes 7% of CPU budget.
@BastiaanOlij it's idea from amateur (me), but can you rethink of applying static shadow buffer exclusively for StaticBody3D meshes? These are stated in documentation as fully immovable objects, only triggered by scripts. You can, like, completely simplify your system of checking on moving objects.
Updated: It should really be mentioned, that static shadows use case always crosses with usage of occlusion culling, because they both rely on static geometry. Because of the way how raster occlusion works in Godot, real CPU bottleneck is occlusion culling itself (yet it reduces GPU frame time).

@BastiaanOlij
Copy link
Contributor Author

@BastiaanOlij it's idea from amateur (me), but can you rethink of applying static shadow buffer exclusively for StaticBody3D meshes? These are stated in documentation as fully immovable objects, only triggered by scripts. You can, like, completely simplify your system of checking on moving objects.

I've been wanting to pick this back up for awhile, both me and @clayjohn discuss this functionality with some regularity during the rendering meetings and privately.

The problem is finding the time to work on this. I currently already have more on my plate than I'm comfortable with so I can't promise anything timing wise. It sadly isn't as simple as rebasing, it needs a fair bit of work to make it mergable.

RENDER_TIMESTAMP("Render Static Shadows");

// Cube shadows are rendered in their own way so we do them first.
for (uint32_t i = 0; i < static_cube_shadows.size(); i++) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this is the only explicit cycle in commit concers me (except the same cycle in render_forward_mobile.h). Could that be the CPU bottleneck that @Calinou has found through minimal reproduction project?
@BastiaanOlij @DarioSamo please investigate. Is it possible to parallelize such cycle? I am bypasser with ~0% programming experience and the fact we have nested loop through frame rendering makes me wonder if that's necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve shadow rendering performance by rendering static shadows and dynamic shadows separately for each light