[3.x] Shader goodies: async. compilation + caching #46330
Can you use feature tags to give different values to the project settings? Alternatively, you could look at how batching can be toggled separately in the editor and running project (at least in 3.2.3).
This is so cool! I need to find some time to give it a proper run but I love the solution.
@@ -2430,7 +2430,11 @@ bool VisualServerScene::_render_reflection_probe_step(Instance *p_instance, int
}

_prepare_scene(xform, cm, false, RID(), VSG::storage->reflection_probe_get_cull_mask(p_instance->base), p_instance->scenario->self, shadow_atlas, reflection_probe->instance);

bool forced_sync_backup = VSG::storage->is_forced_sync_shader_compile_enabled();
VSG::storage->set_forced_sync_shader_compile_enabled(true);
Is this left in from debugging? Or is there a reason that reflection probes always need force sync enabled?
It was intended: in the project this was initially written for, reflection probes were UPDATE_ONCE, so they only had one chance to capture the look of the real shaders.
However, now I realize that for general use this is not enough. Maybe it's just a matter of doing that unless it's UPDATE_ALWAYS. It will probably be more involved than that, but it may be a good start.
I'm wondering if it would be enough to change true to update_mode == UPDATE_MODE_ONCE.
A perfect solution would be to delay the capture until all shaders are compiled. But that is probably out of scope for this PR.
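A minimal sketch of that suggestion, assuming a scope guard that saves and restores the flag. Only `is_forced_sync_shader_compile_enabled()` and `set_forced_sync_shader_compile_enabled()` come from the diff; `Storage`, `UpdateMode`, `ForcedSyncScope` and `render_probe_step` are hypothetical stand-ins so that the example compiles on its own.

```cpp
// Hypothetical stand-ins for the engine types; only the two storage
// method names below appear in the diff under review.
enum UpdateMode { UPDATE_ONCE, UPDATE_ALWAYS };

struct Storage {
	bool forced_sync = false;
	bool is_forced_sync_shader_compile_enabled() const { return forced_sync; }
	void set_forced_sync_shader_compile_enabled(bool p_enabled) { forced_sync = p_enabled; }
};

// Scope guard (an assumption, not part of the PR): optionally forces
// synchronous compilation while alive and restores the previous value
// when it goes out of scope.
class ForcedSyncScope {
	Storage &storage;
	bool backup;

public:
	ForcedSyncScope(Storage &p_storage, bool p_force) :
			storage(p_storage),
			backup(p_storage.is_forced_sync_shader_compile_enabled()) {
		if (p_force) {
			storage.set_forced_sync_shader_compile_enabled(true);
		}
	}
	~ForcedSyncScope() { storage.set_forced_sync_shader_compile_enabled(backup); }
};

// Returns whether the probe step ran with forced synchronous compilation.
bool render_probe_step(Storage &p_storage, UpdateMode p_mode) {
	// Only probes that are captured once need the real shaders right away.
	ForcedSyncScope scope(p_storage, p_mode == UPDATE_ONCE);
	// ... the real code would render the probe face here ...
	return p_storage.is_forced_sync_shader_compile_enabled();
}
```

With a guard like this, the previous value is restored on every exit path, which a manual backup/restore pair can miss on an early return.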
Some early comments:
- This is incredible! The overall design looks great and I'm so happy to have this PR. Looks like 3.2.5 is going to be an exciting release.
- How difficult will it be to add support for particles shaders and canvas_item shaders? It looks like the functionality is already built into shader, so is it just a matter of exposing it in the material, in rasterizer_canvas.glsl, and in the relevant places for particles? I'd like to support all shader types before merging.
GLOBAL_DEF("rendering/gles3/shaders/max_concurrent_compiles", 4);
GLOBAL_DEF("rendering/gles3/shaders/max_concurrent_compiles.mobile", 1);
GLOBAL_DEF("rendering/gles3/shaders/simple_fallback_modulate", Color(1, 1, 1));
GLOBAL_DEF("rendering/gles3/shaders/force_no_render_fallback", false);
We define all project settings in visual_server.cpp now. This way users can still see the gles3 settings when running in GLES2 mode.
Got it. I'll fix it.
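For context on the `.mobile` entry in the settings above: a setting name with a feature-tag suffix acts as an override on platforms where that tag is active. The sketch below only illustrates that lookup order. `Settings` and `get_setting_with_override` are made-up names, and the engine's actual resolution code is not shown in this thread.

```cpp
#include <map>
#include <set>
#include <string>

// Illustrative stand-in for the project settings store; not engine code.
using Settings = std::map<std::string, int>;

// Returns the value of "name.<tag>" for the first active feature tag that
// has an override, otherwise the base value stored under "name".
int get_setting_with_override(const Settings &p_settings, const std::string &p_name,
		const std::set<std::string> &p_active_tags) {
	for (const std::string &tag : p_active_tags) {
		Settings::const_iterator it = p_settings.find(p_name + "." + tag);
		if (it != p_settings.end()) {
			return it->second;
		}
	}
	// Throws std::out_of_range if the base setting was never defined.
	return p_settings.at(p_name);
}
```

Under this assumption, a mobile build would resolve `max_concurrent_compiles` to 1 and a desktop build to 4.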
@@ -683,7 +683,6 @@ void EditorSettings::_load_defaults(Ref<ConfigFile> p_extra_config) {

_initial_set("project_manager/sorting_order", 0);
hints["project_manager/sorting_order"] = PropertyInfo(Variant::INT, "project_manager/sorting_order", PROPERTY_HINT_ENUM, "Name,Path,Last Modified");

Can get rid of this I guess
config.program_binary_supported = GLAD_GL_ARB_get_program_binary;
config.parallel_shader_compile_supported = GLAD_GL_ARB_parallel_shader_compile || GLAD_GL_KHR_parallel_shader_compile;
#else
config.program_binary_supported = true;
We are going to need a special case for WebGL as it never supports glProgramBinary
Good point. I guess that in the WebGL case we will only have the possibility of using the approach based on the parallel compile extension, with no fallback. Also, caching won't be possible at all.
Unfortunately yeah. :(
if (!Engine::get_singleton()->is_editor_hint()) {
ShaderGLES3::force_no_render_fallback = (bool)ProjectSettings::get_singleton()->get("rendering/gles3/shaders/force_no_render_fallback");
#ifdef DEBUG_ENABLED
ShaderGLES3::force_use_fallbacks = (bool)ProjectSettings::get_singleton()->get("debug_force_use_fallbacks");
- ShaderGLES3::force_use_fallbacks = (bool)ProjectSettings::get_singleton()->get("debug_force_use_fallbacks");
+ ShaderGLES3::force_use_fallbacks = (bool)ProjectSettings::get_singleton()->get("rendering/gles3/shaders/debug_force_use_fallbacks");
Good catch. I'll fix that soon.
This is such a cool PR. Really like the approach with the fallback.
I've only tested on Windows and don't have a good project to stress test it but it looks like it's caching my materials and compiling them properly with all the settings on.
Doing a pass over the code, I think Clayjohn already found more than I spotted. It looks well structured and a sound approach. I hope people are able to test the other platforms.
Should fix #13954
I have one of the larger Godot games in development. I built this PR and tested my project on Win 10/64, GTX 1060 6GB, Core i7 8750H, 16GB, SSD. It stutters just as severely. The only difference is that it visually displays the stages of shader compilation as it stutters, rather than showing me fully textured materials and then stuttering like stock Godot. So the end result is actually worse. I loaded my project, enabled I do get the following warning:
Here is what I see when I load the game:
Every other scene does the same thing, visually building the materials over about 3 seconds, AND the performance still lags. It's not just a visual effect, but stuttering still consumes the engine and player controls as it compiles the shaders, with no profiling information available or clue as to WTF it's doing. The engine just pauses, resumes, pauses, resumes...
What I really want is an option to precompile all shaders from code that I can stick in _init() or _ready(). I already load all scenes in the game to fix other performance issues. There's no reason at all why the shaders should wait until they show up on screen to be compiled. What the engine should really do is compile all shaders automatically before _ready() is called, or in the editor like UE. Can you provide functionality to at least precompile on demand from code please? I'd be happy to loop over all materials and add a .compile() before giving control to the player.
If you want access to my repo so you can test directly, dm me your email address on twitter.
Edit: Removed healthbars issue - present in stock 3.2.4rc
Clickable images from Out of the Ashes (follow @TokisanGames):
I don't know if it affects this PR, but here is a possible snag, afaik: when you tell GL to compile a shader, some drivers don't actually compile it right away; they defer a bunch of work until it is first used. If that is the case, shifting compilation to a different thread may not help as much as hoped.
Doesn't seem to work on iOS, but it seems to be working fine on macOS.
The shader sources are also reported, but the log is too big and might be hard to read: https://gist.github.com/naithar/a39c0585bdb3dab1cc88c30b6ee04afa
The last two combinations (with async enabled) result in the cube not being rendered.
@tinmanjuggernaut I have a feeling the stuttering and load times are from loading the models and textures rather than shader compilation. You mentioned that most of your shaders are basic SpatialMaterials. SpatialMaterial shaders are compiled once and then shared between all SpatialMaterials. Additionally, they are very simple and compile nearly instantly (unless you are on a very old device). To test if shaders really are the issue, try loading your scene with the camera pointing straight up with no objects in its field of view. If it still stutters and takes 10 seconds to load, the problem isn't shader compiling. If it loads quickly and then stutters once you move the camera to view the scene + character, then the issue is shader compiling.
This is really cool! I've been dying for something like this to use for my project! I gave it some testing on 64-bit Ubuntu Linux 16.04. The project is Cassette Beasts. I can give you access to the project under NDA if it would help--DM me on twitter (@tccoxon) or email me (tom@bytten-studio.com). Some things I noticed:
I still get some stutters when I run with async compilation enabled, but I haven't eliminated other causes yet.
@tinmanjuggernaut The stuttering could be as @clayjohn says. As for the visual effect, you can render materials while on a loading screen to cause them to compile. There are some intricacies, e.g. you need to make sure certain shadow and environment settings are the same as what you're going to use in the scene. And you also need to know if a material is going to be used with a multimesh or just on a solo mesh, since all these factors lead to different shaders being generated. Happy to chat with you about what I've done in twitter DMs (@tccoxon) if you like.
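To make the point about "different shaders being generated" concrete, here is a rough sketch of how one might enumerate the distinct combinations to warm up during a loading screen. Everything in it (`MaterialUse`, `VariantKey`, `collect_warmup_variants`) is hypothetical, and the real set of factors that select a shader variant is larger than the two flags shown.

```cpp
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical description of one way a material is used in the game.
struct MaterialUse {
	std::string material; // resource path or name
	bool shadows_enabled; // shadow setting of the scene it appears in
	bool uses_multimesh; // drawn through a MultiMesh or on a solo mesh
};

using VariantKey = std::tuple<std::string, bool, bool>;

// Collapses the uses into the distinct combinations, each of which would
// need one warm-up draw under this simplified model.
std::set<VariantKey> collect_warmup_variants(const std::vector<MaterialUse> &p_uses) {
	std::set<VariantKey> variants;
	for (const MaterialUse &use : p_uses) {
		variants.insert(std::make_tuple(use.material, use.shadows_enabled, use.uses_multimesh));
	}
	return variants;
}
```

The same material then shows up once per combination, which is why rendering it a single time on a plane may not cover every case.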
@clayjohn Thanks for the ideas. Shader re-compiling is definitely happening live and there's plenty of time to load every resource in the game. Here are more details:
@tcoxon Thanks. Discord may be easier: TinmanJuggernaut#7375 (@RandomShaper or @clayjohn, feel free to reach me here as well). We do have different lighting per scene. So even basic SpatialMaterials need to be recompiled for different lighting conditions? We have a loading screen and I'm more than happy to manually initiate compiling during this time if that were exposed in the engine or, if it already is, if I knew how to do it.
@Calinou Thanks, but that setting makes no difference for me. Mine stutters windowed or full screen, with or without that setting. The ANGLE PR (#44845) looks interesting. Also I do have an Optimus (re: godotengine/godot-proposals#1725), though I only use the nvidia card. Is stuttering in Godot limited to Optimus? I haven't experienced it in any other application.
Currently, I'm using @tcoxon's suggestion of applying every material in the scene to a plane and waiting for it to render, and I simultaneously rotate the camera 360 degrees. Neither is adequate alone. I'm still testing, but this seems to have addressed 99% of the stuttering, even in stock Godot. Adding this PR means if Godot decides to recompile one again, hopefully it will be faster and with a shorter lag.
However there's still an issue with visual artifacts when it does recompile. I just observed a mesh fully instanced, textured and animating, recompile its material and flash to black & white before coming back. In my earlier tests above I noticed this quite a bit on my main character's hair or other objects that already had a material in the current lighting, then it recompiles, lags, flashes b&w, before coming back exactly as it was.
I will be away from PC for at least two weeks. When I'm back I'll do my best to refine this PR as soon (and as well) as possible. Just FYI.
I’ve made a simple benchmark project to test out execution times with your implementation and Godot 3.3.1. Measured times in seconds: So in my simple test it showed no substantial time gain.
@Leocesar3D, I think your test is well formulated. However, depending on the specifics of the materials, they may or may not trigger the fast path. It'd be interesting to run it with caching disabled and also doing multiple rounds over the set of materials. Please remember that I still have to do (when I can get some time for it) a number of improvements and additional tests, and add more flexibility, because the current implementation may be deciding too much. In any case, thank you for your feedback. I hope I can eventually make this work as expected.
Also keep in mind that to get accurate measurements, you'd have to build both the PR and the last commit before that PR with the same toolchains and options. Comparing official builds of 3.3.1-stable with a custom-made build of this PR would be tricky because:
I do wonder if this PR was abandoned prior to Godot 4.0. I was honestly looking forward to it, especially since GL3 won't be a thing until Godot 4.1.
Since 3.4 is nearing release, I'm afraid it's too late to merge this for 3.4. There are still plans to finish this PR to get it in 3.5 hopefully, but I can't make any guarantees.
I'm looking forward to finishing it, but lately I'm just not having enough time to work on it.
Dang, I was hoping this would be in the latest 3.x. That's one of the reasons I merged. I guess I confused it with the 4.x change. Might have to grab this manually. What's left to do? Merge it with the latest 3.x changes?
This is closed in favor of the new, much better #53411. Those who were interested in this, please check out the new one. I'm keeping this as an archived one for potential future reference.
Superseded by #53411
The main goal of this PR is to reduce stalling in games.
Current limitations:
DISCLAIMER: This implementation has been used in a project where it actually helped reduce stalling caused by shader compilation. However, it should be considered experimental and some testing would be very welcome. Also, the code itself could be better in how some values are made available to the different pieces of the renderer. Ideas welcome!
Shader caching
As long as the target platform supports the program binary GL extension, this is just enable and forget.
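As a rough illustration of the idea (not the PR's actual cache), the sketch below stores an opaque program blob under a key derived from the shader source and a driver identifier. `ProgramBinary` and `ShaderCache` are hypothetical names. The blob and its format stand in for what `glGetProgramBinary` returns and `glProgramBinary` consumes. The driver identifier is part of the key on the assumption that binaries produced by one driver version may be rejected by another.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in for a retrieved program binary: an opaque blob plus its format.
struct ProgramBinary {
	uint32_t format;
	std::vector<uint8_t> data;
};

// Hypothetical in-memory cache. A real one would persist entries to disk
// and would likely hash the key instead of storing the full string.
class ShaderCache {
	std::map<std::string, ProgramBinary> entries;

	static std::string make_key(const std::string &p_source, const std::string &p_driver_id) {
		return p_driver_id + "\n" + p_source;
	}

public:
	void store(const std::string &p_source, const std::string &p_driver_id, const ProgramBinary &p_binary) {
		entries[make_key(p_source, p_driver_id)] = p_binary;
	}

	// Returns nullptr on a miss, in which case the caller compiles from source.
	const ProgramBinary *find(const std::string &p_source, const std::string &p_driver_id) const {
		std::map<std::string, ProgramBinary>::const_iterator it = entries.find(make_key(p_source, p_driver_id));
		return it == entries.end() ? nullptr : &it->second;
	}
};
```

Even on a hit, loading the blob can still fail at the GL level, so a fallback to compiling from source is needed either way.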
Some remarks:
Asynchronous compilation of shaders
It will work if enabled and supported by the GL driver. If native parallel compilation is supported, that's used, which is the most efficient. Otherwise, asynchronicity is achieved via a secondary GL context (and another thread) that sends the compiled shader back to the main one in its binary form, which means the program binary extension must be supported. If both fail, async. compilation is effectively disabled.
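The selection order described above can be sketched as follows. The two capability flags reuse the names from the `config` lines in the diff; `AsyncMode`, `Caps` and `pick_async_mode` are illustrative names, not the PR's code.

```cpp
// Illustrative names; only the two capability flags mirror the diff.
enum AsyncMode {
	ASYNC_NATIVE_PARALLEL, // driver compiles in parallel on its own
	ASYNC_SECONDARY_CONTEXT, // second GL context + thread, result sent back as a binary
	ASYNC_DISABLED
};

struct Caps {
	bool parallel_shader_compile_supported;
	bool program_binary_supported;
};

AsyncMode pick_async_mode(bool p_enabled, const Caps &p_caps) {
	if (!p_enabled) {
		return ASYNC_DISABLED;
	}
	if (p_caps.parallel_shader_compile_supported) {
		return ASYNC_NATIVE_PARALLEL;
	}
	// The secondary context hands the program back in binary form, so this
	// path depends on the program binary extension.
	if (p_caps.program_binary_supported) {
		return ASYNC_SECONDARY_CONTEXT;
	}
	return ASYNC_DISABLED;
}
```

On WebGL, where program binaries are never available, this reduces to native parallel compilation or nothing, as discussed in the review.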
Three fallback modes are added to both manually created shaders (either code-based or visual) and SpatialMaterials: none, simple and no render. Please check the diff, where these are explained in the built-in documentation. The default mode is simple. You can explicitly set a more conservative mode for any shader/material.
The simple fallback is a shadeless shader that is able to transfer to itself the following from the original shader:
- albedo or albedo_color.
- uv1_scale and uv1_offset.
- A texture with hint_*_albedo; else, the first 2D texture used in the material, according to the order of uniforms.
Please also see the diff for an explanation of the different project settings.
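The lookup rules above could be sketched like this. `Uniform`, `has_albedo_hint`, `pick_fallback_color` and `pick_fallback_texture` are hypothetical, and treating `hint_*_albedo` as "starts with `hint_` and ends with `albedo`" is an assumption about how the pattern is meant.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of a shader uniform.
struct Uniform {
	std::string name;
	std::string hint; // e.g. "hint_albedo", "hint_black_albedo" or ""
	bool is_texture_2d;
};

// Assumption: "hint_*_albedo" means a hint that starts with "hint_" and
// ends with "albedo".
static bool has_albedo_hint(const std::string &p_hint) {
	const std::string prefix = "hint_";
	const std::string suffix = "albedo";
	return p_hint.size() >= prefix.size() + suffix.size() &&
			p_hint.compare(0, prefix.size(), prefix) == 0 &&
			p_hint.compare(p_hint.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Index of the first non-texture uniform named albedo or albedo_color, or -1.
int pick_fallback_color(const std::vector<Uniform> &p_uniforms) {
	for (size_t i = 0; i < p_uniforms.size(); i++) {
		const Uniform &u = p_uniforms[i];
		if (!u.is_texture_2d && (u.name == "albedo" || u.name == "albedo_color")) {
			return static_cast<int>(i);
		}
	}
	return -1;
}

// Index of the first 2D texture with an albedo hint; else the first 2D
// texture in declaration order; else -1.
int pick_fallback_texture(const std::vector<Uniform> &p_uniforms) {
	int first_2d = -1;
	for (size_t i = 0; i < p_uniforms.size(); i++) {
		const Uniform &u = p_uniforms[i];
		if (!u.is_texture_2d) {
			continue;
		}
		if (has_albedo_hint(u.hint)) {
			return static_cast<int>(i);
		}
		if (first_2d == -1) {
			first_2d = static_cast<int>(i);
		}
	}
	return first_2d;
}
```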
This code is generously donated by IMVU.