[TRACKER] GPU rendering issues / app crashes #72

Open
30 of 37 tasks
asahilina opened this issue Dec 7, 2022 · 312 comments

@asahilina
Member

asahilina commented Dec 7, 2022

This is a tracker bug for general GPU issues, like:

  • Apps that crash after startup
  • Rendering glitches
  • GPU fault/timeout errors

When making a comment on this bug, please run the asahi-diagnose command and attach the file it saves to your comment. Please tell us what you were doing when the problem happened, what desktop environment and window system you use, and any other details about the issue.

The purpose of this bug is to collect reports of app issues in one place, so we have somewhere to look when figuring out what to work on. Since the driver is still a work-in-progress and lots of things are not expected to work, please don't expect a timely response to reports. We're working on it!

Before reporting something, please check that the issue has not been reported already. Duplicate reports just clutter up the bug and will be marked as duplicate. --marcan

  • If you are having shader errors with Chromium/Electron-based apps after an update, delete your shader cache: rm -rf ~/.config/chromium/Default/GPUCache (or similar paths for other apps). Upstream bug. This is not a driver bug.

  • If you run into a GPU lockup or crash (all GPU apps stop working, but you can still SSH into the machine), please open a new bug in this repo, tell us what you were doing when the GPU locked up, and attach the asahi-diagnose log.

  • If you get GPU fault or GPU timeout messages in dmesg (probably together with rendering issues), but the GPU keeps working in general, this tracker bug is the right place to report that.

  • If you have linux-asahi-edge issues unrelated to the GPU, please report them here. This includes display output/controller related issues, like screen resolution switching and backlight control, which are unrelated to the GPU driver.

  • If you are seeing single-pixel-wide glitches, please set your screen scale to 100%, log out and back in, and try to reproduce it again. These kinds of glitches are likely to be compositor/desktop environment bugs related to fractional scaling, rather than driver issues.
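For the shader cache item in the list above, a minimal sketch of a reusable helper might look like the following. The function name clear_gpu_cache is made up for illustration; the default path is the Chromium one quoted above, and the location for any other Electron app is something you would need to look up yourself.

```shell
# Hypothetical helper: remove a Chromium/Electron GPU shader cache directory.
# With no argument it targets the Chromium path mentioned above; for other
# apps, pass that app's own GPUCache directory as the first argument.
clear_gpu_cache() {
    cache_dir=${1:-"$HOME/.config/chromium/Default/GPUCache"}
    if [ -d "$cache_dir" ]; then
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "nothing to remove at $cache_dir"
    fi
}
```

Calling clear_gpu_cache with no arguments handles Chromium. It is probably safest to close the app first so it does not rewrite the cache while it is being deleted.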

If you see magenta

Magenta is the error color on Apple GPUs. It is what you get when you sample an uninitialized compressed texture. This often happens with driver bugs that break rendering, but many apps also have bugs that transiently display uninitialized buffer contents. On other GPUs or with software rendering, these often show up as black or transparent, which stands out less but indicates the same bug.

If you see magenta glitches, please try running the app with ASAHI_MESA_DEBUG=nocompress. If you see the same problems but they are now black, try LIBGL_ALWAYS_SOFTWARE=true to force software rendering. If you get the same results (still black regions where previously there was magenta), then it is likely an app bug or an upstream Mesa bug, not a driver issue.
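The two-step check above can be sketched as a pair of shell wrappers. This is only an illustration: the wrapper names run_nocompress and run_software are made up here, and only the two environment variables come from the text above.

```shell
# Step 1: run the app with compression disabled in the Asahi driver.
# If the magenta regions turn black, move on to step 2.
run_nocompress() {
    ASAHI_MESA_DEBUG=nocompress "$@"
}

# Step 2: run the same app with software rendering forced.
# If the black regions are still there, the driver is probably not at fault.
run_software() {
    LIBGL_ALWAYS_SOFTWARE=true "$@"
}
```

Usage would be something like run_nocompress your-app followed by run_software your-app, where your-app stands in for whatever command starts the affected program. The variables only apply to that one launch, so nothing needs to be unset afterwards.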

Another common issue is apps that have rendering feedback loops, which are undefined behavior in OpenGL. These often result in 4x2 pixel shaped corruption regions. You can work around this with ASAHI_MESA_DEBUG=nocompress, which should fix the issue (at least if it wouldn't normally break on all GPUs). This could also be caused by a driver bug, though, so please do report anything that is fixed with nocompress so we can take a look and determine whether it's an app bug or a driver bug!

Known issues

Resolved issues

  • System Monitor glitches in the History tab
  • Plasmashell sometimes shows magenta areas or missing rendering
  • Portraits render incorrectly or cause faults in Darwinia
  • Geometry edges render poorly/jagged in Darwinia
  • WebGL Aquarium faults above 10000 fish
  • Xorg sometimes flashes black while switching windows in KWin
  • gl_FragDepth is not implemented
  • three.js scenes (and other complex renders) with MSAA glitch on Pro/Max/Ultra machines
  • Discard regressed with the OpenGL 3.1 update (breaking Darwinia)
  • MSRTT wrongly advertised when not properly supported (making Darwinia MSAA not work)
  • Texture barriers wrongly advertised when not properly supported (X11/Emacs glitching)
  • Water glitches near screen edges in Darwinia if you get up really close.
  • (6.2/explicit sync regression) KWin flashes magenta rectangles when starting up, or right before going to sleep.
  • Nautilus is pink with Adwaita
  • Google Maps & PDF viewer on Firefox use too much memory
  • Corrupted X11 apps on GNOME/XWayland
  • Rendering with some render targets disabled regressed (Inochi2D, KiCad 3D glitching)
  • Figma rendering is glitchy
  • CSS transforms in Firefox render incorrectly
  • Complex GTK apps (GIMP, etc.) under XWayland sometimes glitch magenta
  • Register spilling is excessively slow (example)

Issues that aren't driver bugs

  • gnome-terminal has a glitchy background (upstream bug from 2 years ago, they don't seem interested in fixing it...)
  • Moving the mouse above blurred backgrounds in KWin/Wayland causes artifacts (upstream bug)
  • KWin has single-pixel glitches when fractional scaling is enabled (Fixed in plasma 5.27.7)
  • OBS screen sharing does not work (PipeWire regression, fixed in 0.3.62, plus KWin bug (merged for 5.26.5, ETA Jan 3), plus core Mesa bug)
  • Some WebGL apps on Firefox (like Plex and QuakeJS) can hang/fail to render (upstream bug, fixed for Firefox 110)
  • Firefox sometimes flashes magenta on startup (upstream bug)
  • SuperTuxKart sometimes has rectangular black glitches (upstream bug) (worked around in driver for now)
  • QuakeJS has jittery geometry (emscripten bug, fixed years ago but they need to update). Probably applies to any WebGL apps with similar issues too.
  • Window corruption / magenta regions with Java OpenGL rendering enabled (Java does not double buffer its OpenGL visual so it will always tear/break on any modern system from this decade, just worse on Apple with compression.)
  • Google Sheets gets blurry on scroll (Google Sheets bug, Firefox bug) (workaround: ASAHI_MESA_DEBUG=no16)
  • blender's gl version check is sketchy https://bugzilla.redhat.com/show_bug.cgi?id=2237821
  • xeyes has a magenta/nontransparent background in kwin and wlroots compositors (they do not implement X11 SHAPE, see wlroots issue; GNOME/mutter works).
  • GTK4 apps have visual corruption with non-integer scales (GTK4 bug, patched in Fedora)
  • Faults and mipmap-related artifacts in GTK4 apps (GTK4 bug, worked around in Fedora Asahi Remix with GSK_RENDERER=ngl)
@BarisUlas

This comment was marked as off-topic.

@Chainfire

This comment was marked as resolved.

@aykevl

This comment was marked as resolved.

@TheBrinkOfTomorrow

This comment was marked as resolved.

@mkurz

This comment was marked as duplicate.

@erenatas

This comment was marked as off-topic.

@jannau

This comment was marked as off-topic.

@fredlahde

This comment was marked as resolved.

@BarisUlas

This comment was marked as off-topic.

@marcan
Member

marcan commented Dec 9, 2022

Please do not add "me too" comments, as they just clutter everything up. Once an issue is reported, it's enough, we don't need more reports (unless you're going to add useful info, like more detailed/reliable repro steps --Lina).

Please don't report issues with nonstandard unsupported firmware. If you picked expert mode in the installer and then chose the wrong non-default options, you're on your own. You'll have to reinstall to fix it. It's called expert mode for a reason - if you break it you get to keep the pieces.

@asahilina

This comment was marked as resolved.

@asahilina

This comment was marked as resolved.

@asahilina
Member Author

I noticed that GitHub is showing comments we mark as resolved as simply "hidden". We're just hiding them to declutter the bug, don't worry! Please look for that wording (This comment has been hidden.) if you're looking for previous comments reporting issues that have since been fixed ^^

@iaguis

This comment was marked as resolved.

@fredlahde

This comment was marked as resolved.

@fredlahde

This comment was marked as resolved.

@TellowKrinkle

This comment was marked as resolved.

@Nnubes256

This comment was marked as resolved.

@Capta1nT0ad

This comment was marked as resolved.

@marcan

This comment was marked as resolved.

@levihuayuzhang

This comment was marked as resolved.

@aykevl

This comment was marked as resolved.

@asahilina

This comment was marked as resolved.

@asahilina

This comment was marked as resolved.

@aykevl

This comment was marked as resolved.

@asahilina

This comment was marked as resolved.

@asahilina

This comment was marked as resolved.

@asahilina
Member Author

asahilina commented Nov 9, 2024

New issue: GTK4 apps have visual corruption when using noninteger scales on Wayland. I believe this is a GTK4 bug, not a driver bug (it fails to properly clear a buffer, leading to undefined behavior). It started happening recently because KWin now supports fractional scaling on Wayland, but it is not a regression.

Edit: Filed here: https://gitlab.gnome.org/GNOME/gtk/-/issues/7146

@mkurz

mkurz commented Nov 9, 2024

@asahilina I think the following two bugs in your list above can be marked as resolved(?):

  • Google Maps on Firefox is slow/jerky
  • Moving the mouse above blurred backgrounds in KWin/Wayland causes artifacts (upstream bug)

@mkurz

This comment was marked as resolved.

@alyssarosenzweig

This comment was marked as resolved.

@mkurz

This comment was marked as resolved.

@alyssarosenzweig

This comment was marked as resolved.

@Nefsen402

This comment was marked as resolved.

@alyssarosenzweig
Member

Repro'd with WindWakersWaves.dff from dolphin fifo.ci although I have no idea why this is happening only on Vulkan

@alyssarosenzweig
Member

out.mp4

I get vertex explosions when trying to play The Legend of Zelda: The Wind Waker through dolphin-emu with vulkan. It works fine in OpenGL.

The issue is not limited to that one game. It seems that whenever the emulator tries to render a certain graphical effect, everything will vertex explode. This effect seems to be common as it happens on everything I've tried so far.

Thank you for the bug report! https://rosenzweig.io/0001-hk-fix-primitive-restart-dirty-tracking.patch fixes on my end. This will be resolved in the next driver update (likely in the new year, given $holidays).

@Nefsen402

Tested-by: Alexander Orzechowski <alex@ozal.ski>

It looks like whatever branch you created that patch on has diverged quite a bit from https://gitlab.freedesktop.org/asahi/mesa/-/commits/main?ref_type=HEADS. At any rate, it's trivial to fix up the patch to instead apply on main. The uint32_t restart_index should go on the hk_cs struct instead of hk_graphics_state. With that, it will build.

Also PS, I spent an hour pulling my hair out trying to figure out why LLVMSPIRVLib had a version my system didn't accept. Turns out I had the llvm package installed in fedora which is llvm19 but LLVMSPIRVLib was only compiled for 18 so meson complained. Removing those packages made meson happy.

@alyssarosenzweig
Member

alyssarosenzweig commented Dec 20, 2024

Tested-by: Alexander Orzechowski <alex@ozal.ski>

It looks like whatever branch you created that patch on has diverged quite a bit from https://gitlab.freedesktop.org/asahi/mesa/-/commits/main?ref_type=HEADS. At any rate, it's trivial to fix up the patch to instead apply on main. The uint32_t restart_index should go on the hk_cs struct instead of hk_graphics_state. With that, it will build.

Also PS, I spent an hour pulling my hair out trying to figure out why LLVMSPIRVLib had a version my system didn't accept. Turns out I had the llvm package installed in fedora which is llvm19 but LLVMSPIRVLib was only compiled for 18 so meson complained. Removing those packages made meson happy.

Awesome! And yeah, sorry, I need to rebase asahi/mesa, I've been working on a development tree branched off upstream Mesa because of $DAY_JOB stuff and am a bit all over the place. Should've pushed you a branch to try instead, sorry about that!

Incidentally llvm 19 builds are fixed upstream / in my branch, which is another reason I need to resync asahi/mesa...

@asahilina
Member Author

Also PS, I spent an hour pulling my hair out trying to figure out why LLVMSPIRVLib had a version my system didn't accept. Turns out I had the llvm package installed in fedora which is llvm19 but LLVMSPIRVLib was only compiled for 18 so meson complained. Removing those packages made meson happy.

For future reference, the way you fix that is LLVM_CONFIG=llvm-config-18 meson ... when configuring Mesa.
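Spelled out, that might look like the sketch below. The function name configure_mesa and the build directory name build are assumptions for illustration, not something the Mesa tree requires; llvm-config-18 is the binary name from the comment above and may be named differently on other distributions.

```shell
# Hypothetical wrapper: configure Mesa against a specific LLVM version by
# pointing LLVM_CONFIG at the matching llvm-config binary.
# $1: llvm-config binary to use (default: llvm-config-18)
# $2: build directory (default: build, an arbitrary choice)
configure_mesa() {
    llvm_config=${1:-llvm-config-18}
    builddir=${2:-build}
    LLVM_CONFIG="$llvm_config" meson setup "$builddir"
}
```

Run from the top of the Mesa checkout, configure_mesa with no arguments is equivalent to typing LLVM_CONFIG=llvm-config-18 meson setup build. Any extra Mesa options you normally pass to meson would still need to be added.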

@dupontinquiries

This comment was marked as resolved.

@jvoisin

This comment was marked as resolved.

@jannau

This comment was marked as resolved.

@jannau
Member

jannau commented Dec 31, 2024

loupe/gtk4 vulkan/hk issue reported upstream in https://gitlab.gnome.org/GNOME/gtk/-/issues/7229 with a patch which seems to fix it here

@odlg

odlg commented Jan 1, 2025

I am on Fedora 41 and seeing "DCP has crashed" errors when using a 4K display attached via HDMI on my MacBook Pro M2 Pro. It happens with kernel 6.12.4-400.asahi.fc41.aarch64+16k - I have not seen it with earlier kernels. It is easily reproducible: it crashes within an hour or so.
When using 6.12.1-404.asahi.fc41.aarch64+16k I do not see the issue.
Does this bug report belong here, or should it be separate?
dmesg.dcp-has-crashed.log

@Doublonmousse

This comment was marked as resolved.

@jannau

This comment was marked as resolved.

@Doublonmousse

This comment was marked as resolved.

@asahilina

This comment was marked as resolved.

@Doublonmousse

This comment was marked as resolved.

@jvoisin

This comment was marked as off-topic.

@hackgrid

This comment was marked as off-topic.

@allmazz

This comment was marked as off-topic.

@asahilina
Member Author

asahilina commented Jan 10, 2025

@allmazz Those messages are normal, and this issue isn't the right place for DCP problems either. Unless you have a reason to believe it's a GPU driver problem, please file the issue somewhere else. It could be a GNOME problem unrelated to the Asahi remix.

@odlg Please report DCP bugs as a separate issue. This issue is for GPU issues, DCP isn't the GPU. If you have an easy repro then that would be great ^^

@asahilina
Member Author

When browsing this webpage, the 2D graphics library rendering is correct, but the Second Attempt: WebGL graphics one shows artifacts:

That demo works for me on another platform but the two further down the page don't. If @hackgrid says the second demo breaks the same way on Windows then I think it's fair to say the shaders/rendering code on that page are probably buggy.

@asahilina
Member Author

asahilina commented Jan 10, 2025

As for the GNOME issues, there are several GNOME bugs involved but as I understand it, they've disabled color management by default now, so that works around the first bug. On our side, we're now defaulting to the GL renderer to work around the second issue. So I'm going to mark the comments as resolved here for now.

Once they fix the remaining issues and roll out a release, we can try the Vulkan renderer again and see if anything else is broken or we can re-enable it.

@dupontinquiries some of the fixes/workarounds won't apply to Flatpak since we don't control that GTK4 release. If you can get the environment variables GDK_DISABLE=color-mgmt GSK_RENDERER=ngl passed to the app, that should work around the issue. Please leave another comment if it doesn't.
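One way to pass those variables to a Flatpak app is the --env option of flatpak run. The sketch below wraps that in a small function; the name run_flatpak_with_gtk_workaround and the app ID org.example.App used in the usage note are placeholders, not real identifiers.

```shell
# Hypothetical helper: launch a Flatpak app with the two GTK4 workaround
# variables from the comment above set inside the sandbox.
# $1: Flatpak application ID; remaining arguments are passed to the app.
run_flatpak_with_gtk_workaround() {
    app_id=$1
    shift
    flatpak run --env=GDK_DISABLE=color-mgmt --env=GSK_RENDERER=ngl "$app_id" "$@"
}
```

For example, run_flatpak_with_gtk_workaround org.example.App starts that app once with the workaround. To make it stick across launches, flatpak override --user accepts the same --env options for a given app ID.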

@dupontinquiries

dupontinquiries commented Jan 10, 2025

@dupontinquiries some of the fixes/workarounds won't apply to Flatpak since we don't control that GTK4 release. If you can get the environment variables GDK_DISABLE=color-mgmt GSK_RENDERER=ngl passed to the app, that should work around the issue. Please leave another comment if it doesn't.

that did remove the visual artifacts. thanks
it seems that there are no longer visual artifacts when I run the app normally now as well... a system update must have provided a fix
