Fix regression around OpenGL swapchain optimization for OpenXR #94894
d348d97
to
1eb0039
Compare
Just rebased it, didn't realise my master was a few days old. Shouldn't affect tests. @decacis, I'll try and run your test app, might be a different issue there. |
Note that my MRP was made specifically for the Quest 2. It won't show the performance difference on a Quest 3 since it has better specs. Also, the OpenXR vendors should be updated to 3.0.0 beta 3, or at least that's what I used for this test. |
On the Pico 4 my monkey head test (more monkeys, more better) gives 360 monkeys in 4.3-dev1, 300 in 4.3-dev2, and 340 on this PR. So it is definitely better, but maybe there's still some performance gain to be had. It's pretty consistent run to run, so I don't think it is just measurement error, but it is possible. |
Ah, that explains why for me it reached normal performance. Time to charge my Quest 2 |
Ok that would suggest we're dealing with two separate issues.. |
@BastiaanOlij I can confirm, it seems to be a separate issue in my MRP... I just created another, much simpler project with built-in meshes (Spheres) and no animations or anything fancy, and I get 72fps in 4.3-dev1, ~56fps in 4.3-dev2 and 72fps in this PR. So yeah, I can confirm this PR fixes the original issue. Edit: for reference, this is the MRP I used for this test: https://drive.google.com/file/d/1trhfPlcXqwH8rJsLyTvQOaPxO6zyWw8p/view?usp=sharing (I just changed the OpenXR vendors depending on what version I was testing on). Edit 2: also note that my original project went through dozens of versions when I was bisecting. It is possible that something broke it in some way that creates issues in newer versions. |
@decacis yeah it's a pain bisecting around those builds because of the changes in deployment on Android. It's really easy to make a mistake. Animations are also subject to whether you're doing a release or dev build as we have less compiler optimisations turned on in dev builds. |
I did some testing using the MRP from #94856 on Quest 3. It required some minor changes to work on Quest 3, notably adjusting the export settings for Meta rather than Pico, and setting My results (more monkey heads is better):
@DanielKinsman mentioned above that in his tests on Pico, he wasn't able to get up to the same number with this PR as Godot 4.2.2, but in my tests I landed on exactly the same number. Anyway, this PR seems to fix the issue in my testing, and the code looks good to me (although, I am no rendering expert). |
Thanks! |
I can't confirm that at the moment. It's probably not that easy to test it with the Monkey project. I also noticed an improvement in certain scenes. But I also have scenes that show no improvement at all. Still 73 vs 45 FPS. Obviously there are various factors that influence the result. I will investigate this further and try to isolate the problem. What MSAA settings did the other testers use? |
@RumarioVR Do you have MSAA enabled? If so, that might explain it. MSAA wasn't added until 4.3 dev1. So if your project has MSAA enabled, it will definitely be slower in 4.3 than in 4.2 |
@RumarioVR you can fix the issue with the textures by including this PR: #94902. MSAAx2 is actually pretty much costless on our current Android implementation if the required GL extensions are available (which they are on XR2 chipsets/drivers) AND assuming no features have been enabled that require multiple render passes and thus reading render target data back into tile memory (with MSAA, the performance penalty of that readback scales with the MSAA level: double for x2, quadruple for x4, etc.).
Combine any of those features with MSAA and the performance degradation becomes that much worse. On desktop MSAA will always have a performance penalty as the unique scenario where it is nearly costless revolves around special optimisations possible in TBDR architecture under tight restrictions (just on desktop the performance penalty is tiny compared to mobile). @clayjohn we need to think of a way to better inform users when they enable performance degrading options, especially on mobile.
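To make the scaling described above concrete, here is a rough back-of-the-envelope sketch. It is not Godot code: `readback_bytes`, the render target size and the pixel format are all assumptions chosen for illustration, and the model deliberately ignores compression and driver-specific behaviour. It only shows why reloading a multisampled target into tile memory costs roughly "sample count" times as much as a single-sampled one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cost model (not a Godot or OpenGL API): bytes that have to be
// moved back into tile memory when extra render passes need to reload a
// render target. Every sample of every pixel is stored separately, so the
// traffic grows linearly with the MSAA sample count.
std::uint64_t readback_bytes(std::uint32_t width, std::uint32_t height,
                             std::uint32_t bytes_per_pixel,
                             std::uint32_t msaa_samples,
                             std::uint32_t extra_passes) {
    assert(msaa_samples >= 1); // 1 means MSAA is disabled.
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel *
           msaa_samples * extra_passes;
}

// Cost relative to the same target without MSAA (assumed linear scaling).
std::uint64_t relative_cost(std::uint32_t msaa_samples) {
    // Assumed 2048x2048 RGBA8 per-eye target and a single extra pass.
    return readback_bytes(2048, 2048, 4, msaa_samples, 1) /
           readback_bytes(2048, 2048, 4, 1, 1);
}
```

Under these assumptions `relative_cost(2)` is 2 and `relative_cost(4)` is 4, which matches the "double for x2, quadruple for x4" rule of thumb. With no extra passes there is no readback at all, which is the nearly costless case.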
After a long discussion with @clayjohn we came to the conclusion that #84244 was the cause of our performance regression on OpenGL, due to a misunderstanding on my part about a crucial difference between Vulkan and OpenGL.
This caused `glFinish` to be called in a scenario where it wasn't needed.

Due to the confusion caused by the original naming of methods I've kept `end_frame` in its new place, giving us the opportunity to do the proper swapchain swap on Vulkan and a proper timing check in OpenGL, instead of the workaround that was in place before.

`end_viewport` has been renamed to `gl_end_frame` to better communicate that this is a call specific to OpenGL and is only called after output has been blitted to our swapchain and it has to be handled.

What we are still unsure of, seeing how things were before #84244, is whether the logic is correct when `p_swap_buffers` is false, which happens when a user called `force_draw`.

It is possible that the correct code in the XR branch should be:

However, calling `force_draw` when XR is active is probably fairly dangerous to begin with, as this would result in an extra call to `xrWaitFrame` with all sorts of timing things going wrong.

All in all, it would be good to test this PR against a normal application that uses `force_draw` to see if the functionality works as expected. I do not have a valid example project for this.
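For readers trying to picture the call ordering described above, here is a minimal, self-contained sketch. It is not the code from this PR: `MockBackend` and `finish_frame` are hypothetical names, the mock only records which hooks run, and the relative order of `gl_end_frame` and `end_frame` is an assumption. It also does not attempt to reproduce the alternative XR-branch code mentioned above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a rendering backend. It records the calls it
// receives instead of talking to a GPU, so the ordering can be inspected.
struct MockBackend {
    bool is_opengl = true;
    std::vector<std::string> calls;

    // OpenGL-only hook, meant to run after output was blitted to the swapchain.
    void gl_end_frame(bool p_swap_buffers) {
        calls.push_back(p_swap_buffers ? "gl_end_frame(swap)" : "gl_end_frame(no swap)");
    }

    // Backend-agnostic end of frame (swapchain swap on Vulkan, timing on OpenGL).
    void end_frame(bool p_swap_buffers) {
        calls.push_back(p_swap_buffers ? "end_frame(swap)" : "end_frame(no swap)");
    }
};

// Assumed ordering: blit first, then the OpenGL-specific hook, then end_frame.
// p_swap_buffers would be false for a force_draw-style call.
void finish_frame(MockBackend &backend, bool p_swap_buffers) {
    backend.calls.push_back("blit_to_swapchain");
    if (backend.is_opengl) {
        backend.gl_end_frame(p_swap_buffers);
    }
    backend.end_frame(p_swap_buffers);
}
```

Calling `finish_frame(backend, false)` on the mock models the `force_draw` case and makes it easy to check, in a unit test, that the OpenGL-only hook never runs on a non-OpenGL backend.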