-
-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split rendering of shadowmaps into a static and dynamic pass #77683
base: master
Are you sure you want to change the base?
Split rendering of shadowmaps into a static and dynamic pass #77683
Conversation
This resolves godotengine/godot-proposals#4635, and which is one part of godotengine/godot-proposals#6948 |
And this issue partially i think. #73411 |
Just food for thought, if marking objects as dynamic / static, could it be an idea to future proof this and have a generic flag rather than labelling it specifically for shadows? I (NOTE: This is badly remembered second-hand info from @Calinou from a few years ago so could be "lost in translation" 😄 . This could have been referring to a particular case, and besides, Godot has evolved a lot over the years. If I remember right the idea was to keep Godot as easy to use as possible and with as few options as possible to keep things simple. This is understandable - as time goes on there are inevitably more and more settings - paralysis of choice is a real problem.) (An alternative which I think @Calinou has suggested in the past is to automatically detect static objects, but that could be a bit cumbersome in some cases.) It might also be an idea to have the static / dynamic flag as inherited / static / dynamic, with default inherited, that way users could easily mark a branch as static without marking every mesh. |
I think this new flag makes sense to be called a generic rendering flag, but there's definitely some demand for keeping the GI and general rendeering flags separate (cc @EMBYRDEV). How do other engines handle this in comparison? |
@lawnjelly indeed @clayjohn was discussing with me the merits of introducing a flag for broader use. I kept it straight forward for now just to get the basics working and tested but we can see where we want to take this. I'm interested to know what @reduz suggestion was to not have the flag. We can infer what objects are dynamic to some extend, especially if they are children of physics objects, but there are a lot of situations where we won't know if an object will move until code moves it. The overhead of keeping track of which objects are rendered to a light and which have changed from static to dynamic would be intense not to mention we don't capture situations where an object is just moved one time or as part of a level change. |
Unity has a single per-object static checkbox, that actually sets a bunch of flags for different systems. You can also set it per-system. When changing these values, the user is asked whether to assign it to the current object only, or to its children as well. Personally, I think it's a nice workflow. For basic uses, it's a simple "toggle this for free optimization" check, while also offering advanced options. |
Right now I'm going in the opposite direction, everything is static by default and you need to mark dynamic objects, seeing 90% of your scene will be static, this way the amount of work is lessened. I do believe the answer lies with good IDE functions, configuration warnings if your a child of a physics body and dynamic is not selected, indeed the ability to make a whole tree as dynamic/static. Maybe this should work like our pause function, where the default mode is "inherited from parent" and we go down the tree. If we come all the way down to the root level it's static, if we encounter a physics body it's dynamic, if we encounter a parent that has static of dynamic set, we use that? |
I was just about to suggest this exact thing. With this, it would be pretty easy to create scene configurations that "just work". For example making basic Statics / Dynamics nodes near the root. You would then have to enable the dynamic mode only on a single node and the children would then automatically inherit that (unless they specify otherwise). |
What happen when using lightmaps and what about harirchy? For example in a complex scene you have to go to all meshes and set them static even if a parent node is set to static? Could this be done not in a mesh but in a 3DNode and children inherit from it? |
I see no reason why not |
Not sure, i think it should be automatic, to avoid the user having to click one by one the dynamic mesh. |
Yes, but this feature isn't slated to be merged for 4.1 due to feature freeze. |
I agree with of your comment, except for this part. If I place a 3d object on scene and move it by code, the behaviour should be correct by default. as I see it, performance is "optional", correctness is not. |
The problem though is that you're introducing a hell of a lot of workload, the vast majority of objects in a scene will be static. I think the idea to inherit the behaviour has a lot of merit especially when some of this is automatic (i.e. a mesh instance as a child of a rigid body becomes dynamic automatically) but I was originally planning on only doing this for geometry nodes. |
Any input on this? |
Right now nothing much to say because we're separating the settings for GI/lightmaps and normal lights. To the best of my knowledge, when you mark a light as having to be baked, that light no longer gets evaluated as the light information is now coming from the lightmap or from VoxelGI or SDFGI. So that light in effect is only static at that point in time. If that is the case, only objects marked as dynamic for GI will be handled by the normal shadowmap system which will then divide that remainder in what needs to be rendered into the static shadow map (hopefully nothing) and what goes into the dynamic shadow map. There is no additional overhead if nothing is rendered to the static shadowmap, the logic is able to skip over that nicely. That is a bigger discussion however and comes down to whether it makes sense to have a separate setting for GI and for shadowmaps, or if we should have a single setting that just markes a node as static or dynamic. |
The default could always be different for things created inside Godot and imported scenes. I doubt the performance really matters in level blockouts and I'd rather have my crappy tween driven cube doors and other placeholders work without any extra effort. Would also mean you'd only have to worry about the value during import process which I think would be the natural time and place to set the value anyway (good spot to do any fancy auto-detection too). |
When using LightmapGI, both objects with static and dynamic GI cast shadows (and therefore incur shadow draw calls). In a scenario with fully baked lights, this allows dynamic objects to receive accurate shadows from the world. This however has more performance overhead compared to skipping light rendering entirely for static lightmapped lights. Eventually, we'll need to add a 4th "Fully Static" GI mode that skips light rendering entirely (except when baking the lightmap) and bakes the direct light in the dynamic object capture system (lightmap probes). This could also be supported for VoxelGI and SDFGI, but it's not as important there due to the light data's density generally being too low. The current way of rendering static lights in LightmapGI should remain available as an option though, since it allows for accurate shadow casting onto dynamic objects (which can be valuable in many cases but with better performance than baking indirect light only). |
9e87a3a
to
fd258a8
Compare
Ok omni and spot lights now render their shadow maps properly with the dynamic and static split. Cubemap omni lights still have their two step approach but I think I found a good compromise here in the approach. |
0b98f39
to
f78845a
Compare
Still trying to find an issue with the dynamic cubemap being copied in the wrong order.. |
As I mentioned before, I think to merge this PR, it has to be shown that it really can improve performance vs simply dropping dynamic objects in complex scenes. My experience doing these things is that the bandwidth burden of copying and bookkeeping a static shadow map can be greater or equal to the savings of simply not rendering those objects. This is because the lower end devices that are benefited with not drawing the objects (IGP in general) also have limited memory bandwidth to cope with copying such large shadow buffers. On top of this, on mobile, this behavior is absolutely abysmal in performance because it means a framebuffer load at the beginning of the shadow rendering, which incurs in massively more bandwidth used than just rendering the objects for TBDR architectures. Additionally to this, in general in modern desktop GPUs, while shadow maps are being rendered, compute and other activities can be done at the same time. This may break this parallelism somehow (which was working at some point, would need to be checked again now). To sum up, In general:
|
I don´t know if this can work as a performance test, but from my igpu point, seem´s like the dynamic and static flag don´t seem to affect performance, though my igpu has an amd radeon vega 3 driver,static meshes(not moving) don´t gain any performance when it on static flag. Edit for context: |
sure as it isn't re-rendering the shadow maps, have some dynamic object that moves around the lights, if you got a heavy scene it'd kill performance with no static lights. |
Ok got another scene this time with more cubes. and a character moving with dynamic shadows and a omni light on it and it the shadow2test one the one with more fps is the static. shadows |
Good, so it benefits potato IGPUs.. Will likely translate even better to more complex scenes on higher end hardware, as long as it doesn't run into issues with parallelism, which afaik might be an issue with advanced features such as SDFGI which run in parallel with shadow rendering if I remember correctly... |
Cool benchmark @Saul2022 , the one I was working on was far more elaborate but I might add the things I wanted to add to yours. @reduz I agree with a lot of what you state here. For mobile the copy of the static map is very expensive and may outweigh any benefit. Indeed this seems mostly a technique that applies to high end. The other thing that I started to realise more and more as I was working on a benchmark project is that most lights don't cover a lot of objects to begin with, you may only win performance in very complex environments where lights cover a lot of ground. We need a far more stable algorithm to ensure lights use the same quadrant of the atlas as long as possible. Right now they shift too easily even when its not required. That all said, the logic to autodetect which objects are dynamic and which are static may remain useful. Especially if we go to an approach where we rendering static lights for our stepped directional lights. The problem I have at the moment is that once the list is split to static and dynamic objects, it becomes harder to combine them. |
Maybe what could be done is first merge this pr and then later on for mobile us igpu, make what juan said on not re render shadows at a distance, though that parameter could be a customized sething in the shadow properties or at the project sethings, this could get the best of both worlds, and help the stepped directional shadow pr. |
mobile means phone, not laptop... they use vastly different rendering architectures and philosophies, due to vastly different constraints |
I know, what i mean is applying what juan said for the specs he mentioned(as he mentioned that method for low end and mobile) on a follow up pr, and merge this one for the stepped directional shadows pr. |
OS: Fedora 38 In @Saul2022's MRP, this PR unfortunately causes performance regressions on both a debug editor build and a release export template build. This is due to a CPU bottleneck as evidenced by MangoHud readouts, as the GPU sees much lower utilization:
Other performance metrics:
This may not be as much of an issue in scenes with more complex geometry, but it's something to keep in mind despite the final framerate still being easily above 500 FPS. PS: The capsule's shadow looks very slightly different with this PR too. This may hint at a sampling/filtering issue, though it's not critical. |
protected: | ||
public: | ||
DebugEffects(); | ||
~DebugEffects(); | ||
|
||
void draw_shadow_frustum(RID p_light, const Projection &p_cam_projection, const Transform3D &p_cam_transform, RID p_dest_fb, const Rect2 p_rect); | ||
void draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect); | |
void draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 &p_rect); |
I'd pass by reference here
@Calinou thanks for testing, that kind of matches up with some of my own testing which is why I've let this sit for a bit and been focussing on other parts. At the moment I am unsure of whether this approach simply doesn't provide the benefits we hoped for (as Reduz has indicated in the past), or whether there are faults in the implementation. I do plan to return to this after some of my other workload is behind me and take a fresh look at it. If all else fails, I may bring this back in feature set to match suggestions made by Reduz and purely focus on removing dynamic objects from shadow maps, we'll see. I'm going to put this PR back on draft until we have time to sort this out. |
I'd recommend opening a separate PR if you do this, so that this particular work isn't lost if we decide to bring it back in the future. |
Simply removing dynamic objects from shadow maps can easily be done by the user, by turning "Casts shadows" to off, and the only thing that would need to be done is stepped directional shadow updating at 1 cascade a frame, which would be a slight tweak of the other PR. However I do have to point out, that the GPU used for Testing is Insanely powerful, great for triangle pushing and the scene is still very simple. This should be tested in a scene where the GPU is struggling to keep up with rendering all the Static Shadows, not where it's sleeping while doing so, else you're guaranteed to not see any benefits from this and just the overhead of it. |
@mrjustaguy very true, though on a fast GPU you would expect only a marginal cost or marginal improvement. In my testing I had scenarios where on a fast GPU split shadows were 30% slower. The problem I had is that I couldn't identify why:
I'm very positive that I can figure it out, but at the time I had been working nonstop on this for 2 months and frankly it started to burn me out. So I needed to take a break from it and come back when I can review it with a fresh perspective. @Calinou indeed, when I get back to things I'll probably split of bits that I know are good and useful off into separate PRs, I do NOT want to loose the work I've done here. For all I know there is just a mistake in the approach and once we figure that out, all the work here can be fully salvaged. |
After some time Working on my Project I have found one potentially massive application for this, though for Directional Shadows only with Stepped Cascade updating. Heavy Forest Areas absolutely murder DS Rendering performance especially at higher Shadowmap Resolutions, due to all the Alpha Foliage it's rendering.. An easy way to simulate this behavior for testing is to put on a noise texture on a massive quad, use alpha scissoring, and duping it a couple of dozen layers one on top of the other. |
@@ -1253,6 +1258,15 @@ void RendererSceneCull::instance_geometry_set_cast_shadows_setting(RID p_instanc | |||
_instance_queue_update(instance, false, true); | |||
} | |||
|
|||
void RendererSceneCull::instance_geometry_set_shadow_mode(RID p_instance, RS::ShadowDynamicMode p_shadow_dynamic_mode) { | |||
Instance *instance = instance_owner.get_or_null(p_instance); | |||
ERR_FAIL_COND(!instance); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERR_FAIL_COND(!instance); | |
ERR_FAIL_NULL(instance); |
</constant> | ||
<constant name="SHADOW_RENDER_STATIC_MAP" value="1" enum="ShadowRenderSetting"> | ||
This instance is rendered to static shadow maps. This speeds up shadow map rendering for objects that do not move. | ||
Note: if the objects position or orientation changes, its shadow will not be updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: if the objects position or orientation changes, its shadow will not be updated. | |
[b]Note:[/b] If the objects position or orientation changes, its shadow will not be updated. |
@@ -326,3 +338,27 @@ void DebugEffects::draw_shadow_frustum(RID p_light, const Projection &p_cam_proj | |||
} | |||
} | |||
} | |||
|
|||
void DebugEffects::draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void DebugEffects::draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 p_rect) { | |
void DebugEffects::draw_combined_shadow_map(RID p_static_texture, RID p_dynamic_texture, RID p_dest_fb, const Rect2 &p_rect) { |
This could cause problems when there are objects that only move with interacted with, so it should probably be discouraged. Maybe it should detect from the GI mode instead? It's pretty surefire in my opinion. If a mesh's GI mode is static, you know it doesn't move, so you can safely give it static shadows. |
I know it's basically begging, but could this draft be accounted for 4.4 release? First 4.4 snapshot looked very cool with its rendering performance improvements, so final release might be much cooler in comparison with 4.3. |
This PR needs some deep investigation to figure out exactly why it isn't performing as we expect. Because this feature comes with a lot of tradeoffs (it adds significant complexity to users in exchange for better performance), it needs to have really good performance to be worth merging. We just aren't able to invest enough time into it right now to get it where we want it to be. It is certainly something we want to see finished eventually, but we have other optimizations that are higher priority right now |
@clayjohn even though you claim about other priorities (that's valid reason), I would want to take some words in defense of investigating this PR further. |
I've been wanting to pick this back up for awhile, both me and @clayjohn discuss this functionality with some regularity during the rendering meetings and privately. The problem is finding the time to work on this. I currently already have more on my plate than I'm comfortable with so I can't promise anything timing wise. It sadly isn't as simple as rebasing, it needs a fair bit of work to make it mergable. |
RENDER_TIMESTAMP("Render Static Shadows"); | ||
|
||
// Cube shadows are rendered in their own way so we do them first. | ||
for (uint32_t i = 0; i < static_cube_shadows.size(); i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that this is the only explicit cycle in commit concers me (except the same cycle in render_forward_mobile.h). Could that be the CPU bottleneck that @Calinou has found through minimal reproduction project?
@BastiaanOlij @DarioSamo please investigate. Is it possible to parallelize such cycle? I am bypasser with ~0% programming experience and the fact we have nested loop through frame rendering makes me wonder if that's necessary.
This PR changes the shadow map rendering logic so static objects (objects marked as not moving) are rendered to a static shadow map. This static shadow map is only updated if the light changes or if the light is assigned a new region in the shadow map.
This functionality can be enabled through a new project settings:
Each frame the static shadow map is copied and we render dynamic objects (objects marked as potentially moving) into this shadow map. This is this shadow map that is used to render shadows.
This removes a lot of overhead in rendering shadow maps when there are many static objects.
Geometry can be marked as static or dynamic to determine whether they need to be rendered to shadow maps every frame (if applicable). There is an automatic mode that, with a little overhead, figures out which objects move after the static shadow map is rendered and marks those objects as dynamic automatically. After a few frames dynamic objects are identified and the efficiency is applied.
The property name is currently named
Render shadow
.See the discussion in the comments about possible changes with an approach for using an "inherit from parent" approach though the automatic mode seems to work well enough.
Note if animations are linked to a mesh, we automatically recognize it as dynamic.
Our debug display will overlap the static and dynamic results.
Currently only the Vulkan renderer supports this feature.
Todos:
y-flip the debug image, everything is upside down atm
Make this work for directional shadows (this may become a separate PR or part of Stepped update of directional shadow maps #76291)
Production edit: This closes Improve shadow rendering performance by rendering static shadows and dynamic shadows separately for each light godot-proposals#4635.