Skip to content

Fix regression around OpenGL swapchain optimization for OpenXR #94894

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 29, 2024

Conversation

BastiaanOlij
Copy link
Contributor

After a long discussion with @clayjohn we came to the conclusion that #84244 was the cause of our performance regression on OpenGL due to a misunderstanding on my part between a crucial difference between Vulkan and OpenGL.
This caused glFinish to be called in a scenario that wasn't needed.

Due to the confusion caused by original naming of methods I've kept end_frame in its new place, giving us the opportunity to do the proper swapchain swap on Vulkan, and proper timing check in OpenGL instead of the workaround that was in place before.

end_viewport has been renamed to gl_end_frame to better communicate that this is a call specific to OpenGL and is only called after output has been blitted to our swapchain and it has to be handled.

What we are still unsure of, seeing how things were before #84244 is whether the logic is correct when p_swap_buffers is false, which happens when a user called force_draw.

It is possible that the correct code in the XR branch should be:

if (blits.size() > 0 || !p_swap_buffers) {
	RSG::rasterizer->gl_end_frame(p_swap_buffers);
}

However, calling force_draw when XR is active is probably fairly dangerous to begin with as this would result in an extra call to xrWaitFrame with all sorts of timing things going wrong.

All in all, it would be good to test this PR against a normal application that uses force_draw to see if the functionality works as expected. I do not have a valid example project for this.

@decacis

This comment was marked as outdated.

@BastiaanOlij BastiaanOlij force-pushed the fix_regression_84244 branch from d348d97 to 1eb0039 Compare July 29, 2024 02:47
@BastiaanOlij
Copy link
Contributor Author

Just rebased it, didn't realise my master was a few days old.. Shouldn't effect tests.

Decasis, I'll try and run your test app, might be a different issue there.

@decacis
Copy link
Contributor

decacis commented Jul 29, 2024

Note that my MRP was made specifically for the Quest 2. It won't show the performance difference on a Quest 3 since it has better specs. Also, the OpenXR vendors should be updated to 3.0.0 beta 3, or at least that's what I used for this test.

@DanielKinsman
Copy link
Contributor

On the pico 4 my monkey head test (more monkeys, more better) gives 360 monkeys in 4.3dev1, 300 in 4.3.dev2, and 340 on this PR. So it is definitely better but maybe there's still some performance gain to be had.

It's pretty consistent run to run so I don't think it is just measurement error, but it is possible.

@BastiaanOlij
Copy link
Contributor Author

Note that my MRP was made specifically for the Quest 2. It won't show the performance difference on a Quest 3 since it has better specs. Also, the OpenXR vendors should be updated to 3.0.0 beta 3, or at least that's what I used for this test.

Ah, that explains why for me it reached normal performance. Time to charge my Quest 2

@BastiaanOlij
Copy link
Contributor Author

On the pico 4 my monkey head test (more monkeys, more better) gives 360 monkeys in 4.3dev1, 300 in 4.3.dev2, and 340 on this PR. So it is definitely better but maybe there's still some performance gain to be had.

It's pretty consistent run to run so I don't think it is just measurement error, but it is possible.

Ok that would suggest we're dealing with two separate issues..

@decacis
Copy link
Contributor

decacis commented Jul 29, 2024

@BastiaanOlij I can confirm, it seems to be a separate issue in my MRP... I just created another, much simpler project with built-in meshes (Spheres) and no animations or anything fancy and get 72fps in 4.3-dev1, ~56fps in 4.3-dev2 and 72 in this PR

So yeah, I can confirm this PR fixes the original issue.

Edit: for reference, this is the MRP I used for this test: https://drive.google.com/file/d/1trhfPlcXqwH8rJsLyTvQOaPxO6zyWw8p/view?usp=sharing I just changed the OpenXR vendors depending on what version I was testing on

Edit 2: also note that, my original project went through dozens of versions when I was bisecting. It is possible that something broke it in some way that creates issues in newer versions.

@BastiaanOlij
Copy link
Contributor Author

@decacis yeah it's a pain bisecting around those builds because of the changes in deployment on Android. It's really easy to make a mistake.

Animations are also subject to whether you're doing a release or dev build as we have less compiler optimisations turned on in dev builds.

@dsnopek
Copy link
Contributor

dsnopek commented Jul 29, 2024

I did some testing using the MRP from #94856 on Quest 3. It required some minor changes to work on Quest 3, notably adjusting the export settings for Meta rather than Pico, and setting xr_interface.display_refresh_rate = 90 in _ready() because Quest defaults to 72.

My results (more monkey heads is better):

Version Monkey Heads
Godot 4.2.2 530
Godot 4.3-rc1 390
This PR 530

@DanielKinsman mentioned above that in his tests on Pico, he wasn't able to get up to the same number with this PR as Godot 4.2.2, but in my tests I landed on exactly the same number.

Anyway, this PR seems to fix the issue in my testing, and the code looks good to me (although, I am no rendering expert).

@akien-mga akien-mga merged commit 0e9caa2 into godotengine:master Jul 29, 2024
18 checks passed
@akien-mga
Copy link
Member

Thanks!

@RumarioVR
Copy link

I can't confirm that at the moment. It's probably not that easy to test it with the Monkey project. I also noticed an improvement in certain scenes. But I also have scenes that show no improvement at all. Still 73 vs 45 FPS. Obviously there are various factors that influence the result. I will investigate this further and try to isolate the problem. What MSAA settings did the other testers use?

@clayjohn
Copy link
Member

@RumarioVR Do you have MSAA enabled? If so, that might explain it. MSAA wasn't added until 4.3 dev1. So if your project has MSAA enabled, it will definitely be slower in 4.3 than in 4.2

@BastiaanOlij BastiaanOlij deleted the fix_regression_84244 branch July 29, 2024 23:54
@BastiaanOlij
Copy link
Contributor Author

BastiaanOlij commented Jul 30, 2024

@RumarioVR you can fix the issue with the textures by including this PR: #94902
That should allow you to test without introducing new variables.

MSAAx2 is actually pretty much costless on our current Android implementation if the required GL extensions are available (which they are on XR2 chipsets/drivers) AND assuming no features have been enabled that require multiple render passes and thus reading render target data back into tile memory (which with MSAA doubles the performance penalty for every MSAA level (so double for x2, quadruple for x4, etc.).
Features that seriously impact performance are:

  • Glow/bloom
  • Reading from screen or depth texture
  • Scaling (mostly effects XR where it disables foveated rendering, so what you win by rendering less you loose by rendering less optimally)*
  • Color corrections that require our intermediate buffer to be used

Combine any of those features with MSAA and the performance degradation becomes that much worse.

On desktop MSAA will always have a performance penalty as the unique scenario where it is nearly costless revolves around special optimisations possible in TBDR architecture under tight restrictions (just on desktop the performance penalty is tiny compared to mobile).

@clayjohn we need to think of a way to better inform users when they enable performance degrading options, especially on mobile.

  • If you do want to use scaling to improve performance in XR, you can lower the render resolution of the XR viewport itself by changing the render_target_size_multiplier and the XR runtime will handle the upscaling. But in order for this to look good you have to use Composition layers to make text in menus readable.

@akien-mga akien-mga changed the title Fix regression around OpenGL swapchain optimisation for OpenXR Fix regression around OpenGL swapchain optimization for OpenXR Aug 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Very Bad
7 participants