Restructure shadowmap rendering in mobile renderer #76872
OK, lots of good feedback after talking with some of the GPU guys at ARM, and I have already managed to update a few things. So, a few notes for posterity.

Barriers

But on mobile the TBDR architecture results in all vertices being processed by the vertex shader, and then having rasterization happen per tile. Our I think there are a number of other processes that will benefit from more targeted barriers. What I've done is introduce a When rendering our shadowmap for our cubemap we'll have the following setup:
Running the mobile renderer on desktop probably won't benefit much from this split either, but since desktop isn't TBDR anyway it probably won't make much difference.

Uniform buffers

But we're still creating and updating uniform buffers, and some of these are pretty large (like our light buffers) and are updated each frame. This means a lot of wasted bandwidth. On mobile GPUs we can instead make uniform buffers map to our own data structures and use the source data directly. This does mean that we need to keep that source data around, as we often destroy our buffers, and we need to make sure we don't start overwriting data if we're rendering multiple viewports and things like that. Obviously this needs to be optional logic that detects whether we can map data or must load data into GPU memory. But if we design the mobile renderer with unified memory in mind, it just means the copy we would otherwise always have is only introduced on dedicated GPUs.

RenderAreas

Strangely, it seems that on desktop the opposite is true. For now I've added a boolean, set to true for testing, that makes it use renderAreas (the code for this was already there but commented out, with a remark about this being faster on desktop), but further testing and switching is required.

Cubemap Shadows

That's a lot for mobile and we should investigate alternatives that can be directly rendered into the shadow atlas.
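To make the uniform buffer point above a bit more concrete, here is a minimal sketch of the "map if we can, copy if we must" decision. This is not code from the PR: `UniformUploader`, `UniformBinding`, `bind` and the `can_map` flag are hypothetical names, and a plain `std::vector` stands in for GPU memory. It only illustrates that the source data must stay alive while mapped, and that the copy only happens on the dedicated-memory path.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of what the shader would read from.
struct UniformBinding {
    const void *data; // Memory the GPU would read.
    std::size_t size; // Size of that memory in bytes.
    bool copied;      // True if a copy into "GPU memory" was needed.
};

// Hypothetical helper: picks between mapping source data and copying it.
class UniformUploader {
    bool can_map_;                        // Assumed result of a unified-memory capability check.
    std::vector<unsigned char> gpu_copy_; // Stand-in for dedicated GPU memory.
    std::size_t bytes_copied_ = 0;        // Bandwidth spent on copies so far.

public:
    explicit UniformUploader(bool can_map) : can_map_(can_map) {}

    UniformBinding bind(const void *src, std::size_t size) {
        if (can_map_) {
            // Unified memory: use the source data in place. The caller must
            // keep `src` alive and unmodified until the frame has rendered.
            return { src, size, false };
        }
        // Dedicated memory: fall back to the copy we would otherwise always do.
        gpu_copy_.resize(size);
        if (size > 0) {
            std::memcpy(gpu_copy_.data(), src, size);
        }
        bytes_copied_ += size;
        return { gpu_copy_.data(), size, true };
    }

    std::size_t bytes_copied() const { return bytes_copied_; }
};
```

With `can_map` set, binding a light array returns a pointer to the array itself and `bytes_copied()` stays at zero; without it, the binding points at the internal copy and the counter grows by the buffer size on every bind.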
Dual paraboloid mode is still supported for omni lights in 4.0, but cubemaps have been the default since 3.0. This property is set on a per-light basis and is the same on desktop and mobile. That said, dual paraboloid shadows suffer from a lot of distortion when used with unsubdivided meshes. Maybe look into tetrahedron shadows (4 faces), which don't suffer from as much distortion but should be faster to render than cubemaps. Relevant quote from https://github.com/Calinou/tesseract-renderer-design (which only targets desktop hardware, so I think tetrahedral is still worth trying for mobile):
@Calinou That quote is very interesting. Bastiaan and I were discussing comparing tetrahedral and octahedral shadow maps. Octahedral requires rendering to 8 faces, but the quality is comparable to using cubemaps and the texture lookup is much better than any of the other options.
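For reference, the lookup side of an octahedral map can be sketched as below. This is the standard octahedral direction encoding written as plain C++, not code from this PR or from Godot; `Vec2`, `Vec3`, `oct_encode`, `oct_decode` and `oct_face` are names made up for the example. It assumes a non-zero light-to-fragment direction and maps it to a single UV in [0, 1]², which is what makes the lookup cheap compared to picking a cubemap face.

```cpp
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Returns +1 for v >= 0 and -1 otherwise, so a zero component still folds to a valid side.
static float sign_not_zero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Maps a non-zero direction to UV coordinates in [0, 1]^2.
Vec2 oct_encode(Vec3 d) {
    // Project onto the octahedron |x| + |y| + |z| = 1.
    float n = std::fabs(d.x) + std::fabs(d.y) + std::fabs(d.z);
    float px = d.x / n;
    float py = d.y / n;
    if (d.z < 0.0f) {
        // Fold the lower hemisphere outwards into the corners of the square.
        float fx = (1.0f - std::fabs(py)) * sign_not_zero(px);
        float fy = (1.0f - std::fabs(px)) * sign_not_zero(py);
        px = fx;
        py = fy;
    }
    return { px * 0.5f + 0.5f, py * 0.5f + 0.5f };
}

// Inverse of oct_encode: maps UV in [0, 1]^2 back to a unit direction.
Vec3 oct_decode(Vec2 uv) {
    float fx = uv.x * 2.0f - 1.0f;
    float fy = uv.y * 2.0f - 1.0f;
    float z = 1.0f - std::fabs(fx) - std::fabs(fy);
    float x = fx;
    float y = fy;
    if (z < 0.0f) {
        // Undo the fold for the lower hemisphere.
        x = (1.0f - std::fabs(fy)) * sign_not_zero(fx);
        y = (1.0f - std::fabs(fx)) * sign_not_zero(fy);
    }
    float len = std::sqrt(x * x + y * y + z * z);
    return { x / len, y / len, z / len };
}

// Index (0..7) of the octahedron face a direction falls on, one per sign octant.
int oct_face(Vec3 d) {
    return (d.x < 0.0f ? 1 : 0) | (d.y < 0.0f ? 2 : 0) | (d.z < 0.0f ? 4 : 0);
}
```

Here the +Z direction lands in the centre of the square and -Z lands in a corner, and `oct_face` gives the face that would be rendered for a given direction if the 8 faces are drawn one by one.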
This PR attempts to simplify shadow rendering for the mobile renderer, as we're not trying to run things in parallel with GI.
It also tries out a few recommended performance improvements.
So far this is not having the desired result, so there is still a lot to be done.