r_waylandcompat should be deprecated in favor of EGL_EXT_present_opaque #426

Closed
Yamagi opened this issue Nov 22, 2021 · 11 comments

Comments

@Yamagi
Contributor

Yamagi commented Nov 22, 2021

The fundamental problem here is that Doom 3 sets the alpha bits for some assets, most notably the menu and some in-game GUIs. With GLX these are pseudotransparent to the background of the GL context, the clear color or something similar. With EGL they're transparent in the literal sense of the word, leading to all kinds of interesting render problems. As a workaround r_waylandcompat was added. When enabled, the alpha bits of the GL context are forced to 0. While this solves the direct problem, it causes a lot of other render problems... At least on Gnome 41 with r_waylandcompat 0 the game is playable, with r_waylandcompat 1 it's more or less broken.

Luckily there's a solution for this. Mesa 21.3.0 added the EGL_EXT_present_opaque extension, which can be used to force the presented surface to be opaque. That's the same behavior GLX had. SDL 2.0.18 will support it, controlled by the SDL_HINT_VIDEO_EGL_ALLOW_TRANSPARENCY hint, which defaults to 0 (opaque presentation). Nvidia already said that they also support that EGL extension. I don't know if they've already implemented it or if it will be part of a later update.

So: Let's remove r_waylandcompat, it does more harm than good. Users who run the game under Wayland (which must be forced by setting SDL_VIDEODRIVER=wayland anyway) should use an up-to-date Mesa and SDL instead.

@Yamagi
Contributor Author

Yamagi commented Nov 22, 2021

@DanielGibson
Member

I wonder if the underlying issue can be fixed in dhewm3 so neither is necessary, by making sure that the alpha channel doesn't contain the wrong value, or whatever exactly causes this.

@DanielGibson
Member

I investigated the underlying issue a bit - the underlying issue is "why the hell does the window need alpha bits, it's not supposed to be translucent?!".

So I set r_waylandcompat 1, which makes dhewm3's GLimp_Init() call SDL_GL_SetAttribute(SDL_GL_ALPHA_SIZE, 0); instead of SDL_GL_SetAttribute(SDL_GL_ALPHA_SIZE, 8);, and tried to figure out why it looks wrong.

For reference, this is what the main menu is supposed to look like:
[screenshot: main menu rendered correctly]
and this is what it looks like with SDL_GL_ALPHA_SIZE set to 0 instead of 8 (at least with most drivers), i.e. with r_waylandcompat 1:
[screenshot: main menu with 0 alpha bits]

Some observations:

  • I could reproduce this bug on Linux X11 with Nvidia's and Intel's drivers and also on the Raspberry Pi 4 with its Linux driver (also X11). I didn't try AMD on Linux and I didn't try Wayland.
  • I could also reproduce it on Windows with Nvidia's drivers.
  • I could not reproduce it on Windows with AMD's drivers, but as far as I can tell that's because AMD will give us an OpenGL context with 8 alpha bits even if 0 were requested (which is not wrong, it's supposed to use the closest "visual" to the one requested, but also not helpful because it means I can't debug it with AMD-specific tools).
  • Debugging this is annoying as hell, because there is hardly any GPU debugging tool that supports "legacy" OpenGL, meaning OpenGL before 3.2, sometimes even 4.x, or compatibility profiles in general. For example, RenderDoc and Nvidia Nsight only support "modern" OpenGL core profiles. Not sure about AMD's tools (like "Radeon GPU Analyzer" or CodeXL) - at first glance I haven't found anything explicitly saying they don't support legacy GL, but as I can't reproduce the issue with AMD's Windows drivers there was no reason to look into this further.
    The only tool I'm aware of that can help is apitrace, but unfortunately when I replay a trace that looked wrong when recording it, it looks right in apitrace, so I'll have to fix apitrace first. Also, compared to RenderDoc, apitrace isn't that much fun to use, but OTOH it's certainly better than nothing so it'll have to do.

@DanielGibson
Member

DanielGibson commented Jan 17, 2022

Ok, the problem with apitrace was that its retrace tool (used to get screenshots of the frames and the buffers and further information about the OpenGL state at a specific gl-call) doesn't really use the same "visual" (settings for bits per color channel etc) as the traced program: https://github.com/apitrace/apitrace/blob/764c9786b2312b656ce0918dff73001c6a85f46f/retrace/glws_glx.cpp#L265-L273
So it always sets 8 alphabits, which explains why the replay doesn't show the issue..

I added a quick hack that allows setting an environment variable to choose the number of alpha bits:

diff --git a/retrace/glws_glx.cpp b/retrace/glws_glx.cpp
index 749b58dd..dac3f17f 100644
--- a/retrace/glws_glx.cpp
+++ b/retrace/glws_glx.cpp
@@ -262,12 +262,18 @@ createVisual(bool doubleBuffer, unsigned samples, Profile profile) {
     GlxVisual *visual = new GlxVisual(profile);
     Attributes<int> attribs;
 
+    int alphabits = 8;
+    const char* alphaBitEnv = getenv("APITRACE_ALPHA_BITS");
+    if(alphaBitEnv != NULL) {
+        alphabits = atoi(alphaBitEnv);
+    }
+
     attribs.add(GLX_DRAWABLE_TYPE, GLX_WINDOW_BIT);
     attribs.add(GLX_RENDER_TYPE, GLX_RGBA_BIT);
     attribs.add(GLX_RED_SIZE, 8);
     attribs.add(GLX_GREEN_SIZE, 8);
     attribs.add(GLX_BLUE_SIZE, 8);
-    attribs.add(GLX_ALPHA_SIZE, 8);
+    attribs.add(GLX_ALPHA_SIZE, alphabits);
     attribs.add(GLX_DOUBLEBUFFER, doubleBuffer ? GL_TRUE : GL_FALSE);
     attribs.add(GLX_DEPTH_SIZE, 1);
     attribs.add(GLX_STENCIL_SIZE, 1);

So I can start the apitrace GUI with APITRACE_ALPHA_BITS=0 ./qapitrace and will get to see the behavior with 0 alphabits - nice.
Also handy that this allows me to use the exact same trace to compare how it looks at a specific stage with 0 vs 8 alphabits.
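
One caveat with the quick hack above: atoi() returns 0 for text that isn't a number, so a typo in the variable would silently select 0 alpha bits. Below is a minimal sketch of a stricter parse. The helper name alphaBitsFromEnv and the accepted range of 0 to 16 are assumptions made for illustration, not part of apitrace.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper (not part of apitrace): read an alpha-bit count from
// the environment. Falls back to `fallback` when the variable is unset,
// empty, not a plain decimal number, or outside an assumed 0..16 range.
int alphaBitsFromEnv(const char* name, int fallback) {
    assert(name != nullptr);
    const char* text = std::getenv(name);
    if (text == nullptr || *text == '\0') {
        return fallback;
    }
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    // Reject trailing garbage, overflow and implausible bit counts.
    if (errno != 0 || *end != '\0' || value < 0 || value > 16) {
        return fallback;
    }
    return static_cast<int>(value);
}
```

In the patched createVisual() this would be called as alphaBitsFromEnv("APITRACE_ALPHA_BITS", 8), keeping 8 as the default.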

What I found out was: before the OpenGL drawcalls that made it look wrong, there were calls to glBlendFunc(GL_DST_ALPHA, GL_ONE_MINUS_DST_ALPHA); - while most of the time glBlendFunc() is called with arguments like GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA.

This makes sense - the source for blending is the currently rendered polys and their texture, the destination is the Default Framebuffer of the OpenGL context (more specifically GL_BACK) - which has 0 alphabits (=> no alpha channel), as requested.
So (as far as I understand) it's to be expected that GL_DST_ALPHA or GL_ONE_MINUS_DST_ALPHA don't work properly and look different than when the Default Framebuffer does have an alpha channel.
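
To make the effect concrete, here is a small CPU model of that blend mode. It is only a sketch: it assumes the default GL_FUNC_ADD blend equation, and it assumes that a framebuffer without alpha bits reads back a destination alpha of 1.0, which is my understanding of how OpenGL treats a missing alpha component. The type and function names are made up for illustration.

```cpp
#include <cassert>

struct Rgba { float r, g, b, a; };

// Destination alpha as seen by the blend stage. Assumption: without an
// alpha channel in the framebuffer, destination alpha reads as 1.0.
float effectiveDstAlpha(const Rgba& dst, bool framebufferHasAlpha) {
    return framebufferHasAlpha ? dst.a : 1.0f;
}

// CPU model of glBlendFunc(GL_DST_ALPHA, GL_ONE_MINUS_DST_ALPHA) with the
// default additive equation: out = src * dstA + dst * (1 - dstA).
Rgba blendDstAlpha(const Rgba& src, const Rgba& dst, bool framebufferHasAlpha) {
    const float da = effectiveDstAlpha(dst, framebufferHasAlpha);
    assert(da >= 0.0f && da <= 1.0f);
    const float inv = 1.0f - da;
    return { src.r * da + dst.r * inv,
             src.g * da + dst.g * inv,
             src.b * da + dst.b * inv,
             src.a * da + da * inv };
}
```

With a destination alpha of 0.1 the source contributes only 10% to the result. Under the stated assumption, a framebuffer without alpha turns the same call into "replace the destination with the source", which would explain an overlay texture showing up fully opaque.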

So what can we do about this?

I guess one possibility is to use a Framebuffer Object (configured with an alpha channel) to render to, at least on hardware that supports GL_EXT_framebuffer_object (or ideally GL_ARB_framebuffer_object), and only set alpha bits on the window/context on hardware that doesn't.
While I assume that basically all halfway-relevant hardware supports this, esp. one that might use Wayland (it's part of OpenGL 3.0, and probably was well supported before with the extension), I have no idea whether this has any performance implications.
And I don't know how much work it would be to actually implement this..

@Yamagi
Contributor Author

Yamagi commented Jan 17, 2022

Some screenshots to illustrate the problem. Let's start with SDL 2.0.20 with SDL_HINT_VIDEO_EGL_ALLOW_TRANSPARENCY set to 1. That's the behavior with SDL <2.0.18 and SDL >= 2.0.18 on systems with Mesa <21.3. The screenshot was taken with the Gnome screenshot tool:
[screenshot: broken]

The actual output on screen looks even worse. This is a photo taken with my phone:
[photo: phone]

With SDL_HINT_VIDEO_EGL_ALLOW_TRANSPARENCY set to 0 (the default) the transparency problem is gone but the menu still has its problems:
[screenshot: unbroken]

And the intro sequence doesn't really play, it renders static as soon as it enters Doctor Betruger's command bunker. Several in-game GUIs like the advertisement right after passing the health scan are broken, too:

[screenshot: static]

@DanielGibson
Member

DanielGibson commented Jan 17, 2022

> With SDL_HINT_VIDEO_EGL_ALLOW_TRANSPARENCY set to 0 (the default) the transparency problem is gone but the menu still has its problems:

Hmm that looks pretty much like X11 (or Windows+nvidia) with 0 alpha bits..

> And the intro sequence doesn't really play, it renders static as soon as it enters Doctor Betruger's command bunker. Several in-game GUIs like the advertisement right after passing the health scan are broken, too:

In the main menu those bars where the Doom3 logo should be are a texture that actually looks like that and is supposed to be blended onto the logo (with glBlendFunc(GL_DST_ALPHA, GL_ONE_MINUS_DST_ALPHA)) - if the framebuffer does have alpha, it's barely noticeable, just some static/glitch/tearing-like animated scrolling bars.

I assume that it's similar for the static in front of the video/cutscene: Probably it's just supposed to be some subtle static effect on top of the real scene, but for some reason the framebuffer doesn't have alpha so the blending with "destination alpha" makes it opaque.

UPDATE: Some things worth testing: latest SDL git; making SDL use native wayland vs xwayland - and checking the dhewm3 console output (or the newly added ~/.local/share/dhewm3/dhewm3log.txt) for the Requested 8 color bits per chan, 8 alpha 24 depth, 8 stencil and Got 8 stencil bits, 32 depth bits, color bits: r8 g8 b8 a8 lines.
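
For anyone comparing several runs, the alpha bit count can be pulled out of that "Got ..." line mechanically. The following is a rough sketch that assumes the line has exactly the wording quoted above; the function name is hypothetical and nothing like it exists in dhewm3.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: extract the reported alpha bit count from a dhewm3
// log line of the form
//   "Got 8 stencil bits, 32 depth bits, color bits: r8 g8 b8 a8"
// Returns -1 if the line does not match that (assumed) format.
int alphaBitsFromGotLine(const char* line) {
    assert(line != nullptr);
    int stencil = 0, depth = 0, r = 0, g = 0, b = 0, a = 0;
    const int matched = std::sscanf(
        line,
        "Got %d stencil bits, %d depth bits, color bits: r%d g%d b%d a%d",
        &stencil, &depth, &r, &g, &b, &a);
    return matched == 6 ? a : -1;
}
```

This only tells you what the context reports about itself, so treat it as a first check rather than proof that the presented surface really has an alpha channel.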

@Yamagi
Contributor Author

Yamagi commented Jan 17, 2022

Another try with SDL built from git, the exact commit was b06866ef9717d57ced4443d90ab8f0a5105be07a. dhewm3 was started with a clean config to rule out any configuration problems, r_waylandcompat is definitely set to 0. SDL reports that the framebuffer has 8 alpha bits:

Requested 8 color bits per chan, 8 alpha 24 depth, 8 stencil
Got 8 stencil bits, 24 depth bits, color bits: r8 g8 b8 a8

But apparently it hasn't.

The exact same setup under xwayland looks good. The menu renders correctly, the cutscene and the in-game GUIs are working like they should.

Some more information regarding my system:

  • AMD Radeon RX 5700XT
  • Linux 5.16.1, Mesa 21.3.4 on Arch Linux
  • Window system is Gnome 41.3 with Wayland

@DanielGibson
Member

DanielGibson commented Jan 17, 2022

Thank you very much!

Mentioned the issue at libsdl-org/SDL#4306 (comment)

@Yamagi
Contributor Author

Yamagi commented Jan 18, 2022

I've opened a Mesa bug report: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5886

@DanielGibson
Member

I pushed a workaround (it tells SDL2 to not use EGL_EXT_present_opaque and makes sure that the window's default framebuffer is opaque at the end of a frame).
See 699779e for details.

rorgoroth pushed a commit to rorgoroth/dhewm3 that referenced this issue Apr 8, 2023
For some reason Wayland thought it would be clever to be the only
windowing system that (non-optionally) uses the alpha chan of the
window's default OpenGL framebuffer for window transparency.
This always caused glitches with dhewm3, as Doom3 uses that alpha-chan
for blending tricks (with GL_DST_ALPHA) - especially visible in the main
menu or when the flashlight is on.
So far the workaround has been r_waylandcompat which requests an OpenGL
context/visual without alpha chan (0 alpha bits), but that also causes
glitches.
There's an EGL extension that's supposed to fix this issue
(EGL_EXT_present_opaque), and newer SDL2 versions use it (when using
the wayland backend) - but unfortunately the Mesa implementation is
broken (seems to provide a visual without alpha channel even if one was
requested), see https://gitlab.freedesktop.org/mesa/mesa/-/issues/5886
and libsdl-org/SDL#4306 (comment)
for the corresponding SDL2 discussion

To work around this issue, dhewm3 now disables the use of that EGL
extension and (optionally) makes sure the alpha channel is opaque at
the end of the frame.
This behavior is controlled with the r_fillWindowAlphaChan CVar:
If it's 1, this always is done (regardless if wayland is used or not),
if it's 0 it's not done (even on wayland),
if it's -1 (the default) it's only done if the SDL "video driver" is
  wayland (this could be easily enhanced later in case other windowing
  systems have the same issue)

r_waylandcompat has been removed (it never worked properly anyway),
so now the window always has an alpha chan
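
The three-way CVar logic described in the commit message can be sketched as follows. This is not the code from the commit: the function names are hypothetical, the driver string is assumed to be what SDL_GetCurrentVideoDriver() returns, and the pixel loop is only a CPU stand-in for whatever OpenGL calls the real workaround uses to force the alpha channel to opaque.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical decision helper for r_fillWindowAlphaChan:
//   1  -> always fill, 0 -> never fill,
//  -1  -> fill only when the SDL video driver is "wayland".
bool shouldFillWindowAlpha(int cvarValue, const char* videoDriver) {
    assert(cvarValue >= -1 && cvarValue <= 1);
    if (cvarValue == 1) return true;
    if (cvarValue == 0) return false;
    return videoDriver != nullptr && std::strcmp(videoDriver, "wayland") == 0;
}

// CPU stand-in for the end-of-frame step: set the alpha byte of every
// RGBA8 pixel to fully opaque and leave the color bytes untouched.
void fillAlphaOpaque(std::vector<std::uint8_t>& rgba) {
    assert(rgba.size() % 4 == 0);
    for (std::size_t i = 3; i < rgba.size(); i += 4) {
        rgba[i] = 255;
    }
}
```

Keeping the decision in a single function would also make it easy to add further driver names later, as the commit message suggests.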
@fooishbar

Sorry, that extension was totally broken. It's fixed in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27709 which should be shipped in Mesa 24.1 and hopefully some 24.0.x point releases.
