Two GPUs, One monitor per GPU low performance #281

Open
Quackdoc opened this issue Jan 19, 2024 · 19 comments

Comments

@Quackdoc
Contributor

Quackdoc commented Jan 19, 2024

On my desktop I currently have two GPUs, and each GPU has a single display attached to it. Copying to the second display is quite slow. The framerate is noticeably worse and jumpy (though my monitor doesn't have the facilities to tell me how much worse). It gets even slower when I use the second GPU to render an application (e.g. using mpv with --vulkan-device).

Primary GPU: Intel Arc A380
Secondary GPU: RX 580 4GB
Commit: e569e14

EDIT: It's also worth noting that having some other apps on the second screen will also slow down the primary display after some time; closing the apps or moving them to the primary display for some time fixes the perf regression. The actual performance drop seems sporadic at best, though. It's been very hard to find a replication case, but it seems to happen with Telegram more often than not.

@ids1024
Member

ids1024 commented Jan 19, 2024

Interesting. Do you happen to know how performance compares with the same hardware on other Wayland compositors, like Gnome Wayland?

Intel + Nvidia currently seems to perform fine with recent Nvidia generations, but not so well with Pascal and Turing cards (on either cosmic-comp or gnome-shell). But I haven't seen what performance is like with Intel Arc + AMD.

@Quackdoc
Contributor Author

I haven't tested GNOME, but KWin/KDE is fine in terms of performance. Sway has a weird performance issue when using the Arc GPU as the primary GPU, but that is a Sway + Arc specific issue; I've noticed it even when the AMD GPU is disabled. I can test GNOME a bit later if that would be helpful.

@Drakulix
Member

Drakulix commented Jan 19, 2024

Copying to the second display is quite slow.

Note that it shouldn't copy from one to the other if there are no applications running on the other GPU anyway.

If you want to visualize which GPU is used, you can enable the debug feature of cosmic-comp when building, which enables a debug overlay that shows the rendering GPU and frame times. (Note that it sometimes glitches out with multiple GPUs, but in general it is quite usable and not meant for end users anyway.)

Additionally, there is an experimental flag that can be toggled with an environment variable (COSMIC_RENDER_AUTO_ASSIGN=y, e.g. inside $HOME/.profile), which sets apps to run on the GPU whose output they were spawned on, hopefully minimizing cases where contents need to be copied between GPUs.
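
A minimal sketch of setting that flag persistently, assuming a POSIX shell that reads the profile file at login. The helper name `enable_auto_assign` is hypothetical, and the `export` keyword is an assumption (the comment above only gives `COSMIC_RENDER_AUTO_ASSIGN=y`); it is there so the session's child processes inherit the variable.

```shell
# Hypothetical helper: add the experimental flag to a profile file exactly once.
# Assumption: `export` is used so processes started from the login shell inherit it.
enable_auto_assign() {
    profile_file="$1"
    line='export COSMIC_RENDER_AUTO_ASSIGN=y'
    touch "$profile_file"
    # -x matches whole lines only, -F treats the pattern as a fixed string
    grep -qxF "$line" "$profile_file" || printf '%s\n' "$line" >> "$profile_file"
}
```

Calling `enable_auto_assign "$HOME/.profile"` and then logging out and back in should be enough for the compositor to see the variable; running it twice does not add a second line.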

None of these things are solutions to your problem, but they might help to discover the more specific circumstances that cause the performance characteristics you are seeing.

If you could test and narrow this down somewhat (e.g. it only happens when apps running on the Intel GPU are visible on the AMD output and the debug overlay shows the Intel GPU responsible for compositing), that makes it easier to reproduce and find problems. Comparing to KDE is already quite helpful. If I had a clearer description of the conditions (like the one given in the example), I could compare code paths with KWin to figure out potential problems of our code with your configuration.

@Quackdoc
Contributor Author

Cosmic screenshot doesn't seem to capture the overlays, and there doesn't seem to be a way to log the performance in a profile, so here are my findings from the testing in written form.

All testing was done using Firefox with multiple open tabs (19-20 open, 5 or so loaded), the Kitty terminal emulator, and the Telegram client from the Arch store (the Flatpak previously had the same issues), as well as mpv. I will look for a more consistent reproduction case.

First findings when using COSMIC_RENDER_DEVICE=/dev/dri/renderD128

Unfortunately, it seems the debug overlay itself has a significant performance impact. When I have Firefox running fullscreen, having the overlay enabled brings the performance right down to 20fps. Thankfully, opening another window causes perf to go back up.

Even when I force mpv to render on the AMD GPU using VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json mpv, cosmic-comp's debug overlay still shows the render device as D128.
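
To make sense of names like D128 in the overlay, it can help to check which kernel driver sits behind each render node. The sketch below reads the `device/driver` symlink under `/sys/class/drm`; the function name `list_render_nodes` is hypothetical, and which node maps to which GPU on the machine in this thread is not stated, so no output is shown.

```shell
# Print "<render node> <kernel driver>" for every DRM render node.
# The sysfs root is a parameter so the function can be pointed at a test tree.
list_render_nodes() {
    sysfs_root="${1:-/sys/class/drm}"
    for node in "$sysfs_root"/renderD*; do
        [ -e "$node" ] || continue            # glob did not match: no render nodes
        driver_link="$node/device/driver"
        if [ -L "$driver_link" ]; then
            # The symlink target's last path component is the driver name
            driver=$(basename "$(readlink "$driver_link")")
        else
            driver="unknown"
        fi
        printf '%s %s\n' "$(basename "$node")" "$driver"
    done
}

list_render_nodes
```

On an Intel + AMD system the drivers would typically be `i915` (or `xe`) and `amdgpu`, which tells you which `/dev/dri/renderD*` path belongs to which card.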

Frame times throughout the test were spotty, but I actually haven't been able to replicate the issue with the debug feature enabled. I will keep testing and report back if it does happen, however.

As I was writing this, I caused it to happen using Telegram on the second display, with Firefox, Kitty, and mpv open on the primary display.

Before closing Telegram

Primary

avg fps 25
avg ft 0.020xxxx
min ft 0.002548
max ft 0.236963

Secondary

avg fps 18
avg ft 0.111993
min ft 0.002261
max ft 0.180663

After closing Telegram

Primary

avg fps 22
avg ft 0.031xxx
min ft 0.002587
max ft 0.268032

Secondary

avg fps 25
avg ft 0.001010
min ft 0.000621
max ft 0.001874

Performance didn't fix itself until I closed the debug overlay. Sadly, when I open the debug overlay again, the performance regression recurs.
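
As a rough aid for reading the frame times above, a frame time can be converted to the frame rate it would imply with fps = 1 / frame time. This is a sketch under the assumption that the overlay reports seconds, which the thread does not state; the helper name `ft_to_fps` is hypothetical. Note that the overlay's own average fps and average frame time do not invert to each other, so the two numbers may measure different things.

```shell
# Convert a frame time to the frame rate it implies (fps = 1 / frame time).
# Assumption: the value is in seconds; the overlay's unit is not documented in this thread.
ft_to_fps() {
    awk -v ft="$1" 'BEGIN { printf "%.1f\n", 1 / ft }'
}

ft_to_fps 0.236963   # max frame time on the primary display before closing Telegram
ft_to_fps 0.002548   # min frame time on the same display
```

Under that assumption, the worst frame on the primary display corresponds to a momentary rate of roughly 4 fps, which matches the "jumpy" feel described in the original report.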

When using COSMIC_RENDER_AUTO_ASSIGN=y

The first thing to note was that cosmic-panel was riddled with graphical corruption (picture provided).

Performance was terrible. The primary window was rendering at about 20fps and the secondary one at 17fps. However, each one was riddled with massive frame spikes and pacing issues. On the primary display:

avg ft 0.025xxx
min ft 0.002172
max ft 0.244019

On the secondary display with nothing running:

avg ft 0.001031
min ft 0.000464
max ft 0.001660

After opening Telegram on the second display:

avg ft 0.00163x
min ft 0.00813
max ft 0.002526

After moving Telegram to the primary display and Firefox to the secondary, I ran into what looks like pixel format corruption in Telegram (which was rendering on the AMD GPU) but none in Firefox (pictures provided). That being said, the performance actually went up on both monitors. On the primary display:

avg ft 0.0035xx
min ft 0.002223
max ft 0.012617

Secondary display:

avg ft 0.012xxx
min ft 0.002563
max ft 0.0634xx

P.S. Reading the debug information was actually quite difficult for me; it would be nice if there were a translucent black or opaque background behind it.

[Image: cosmic-corruption]

@ids1024
Member

ids1024 commented Jan 19, 2024

Even when I force MPV to render using the AMD gpu using VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json mpv Cosmic-comp's debug overlay still shows the render device using D128.

You can try running Wayland clients with WAYLAND_DISPLAY=wayland-1-renderD129, which is a bit of a hack cosmic-comp currently has to indicate which GPU to advertise to clients.
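
A minimal sketch of using that socket name for a single client without changing the rest of the session. The wrapper `run_on_render_node` and the `WAYLAND_BASE` override are hypothetical; `wayland-1` is the base socket name from the comment above and may differ on other setups.

```shell
# Hypothetical wrapper: run one client against the socket cosmic-comp exposes
# for a given render node, leaving the parent shell's WAYLAND_DISPLAY untouched.
# Assumption: socket names follow the "<base>-<render node>" pattern shown above.
run_on_render_node() {
    node="$1"
    shift
    WAYLAND_DISPLAY="${WAYLAND_BASE:-wayland-1}-${node}" "$@"
}
```

For example, `run_on_render_node renderD129 mpv video.mkv` would start mpv with `WAYLAND_DISPLAY=wayland-1-renderD129` (the video path is a placeholder).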

I'm not familiar with the Vulkan code, but for EGL/OpenGL on Wayland, with DRI_PRIME, Mesa seems to always use the main device advertised by the Wayland compositor as the "display GPU". So if it uses prime, the client will copy from the render GPU to a linear buffer on that GPU, even when the window is going to be displayed on a monitor attached to that GPU.

Some improvement is still needed in Mesa (and probably the dmabuf Wayland protocol) to better handle multi-gpu on Wayland. Though if Kwin works better, cosmic-comp presumably isn't handling this case optimally somehow.

@Quackdoc
Contributor Author

I was able to test a bit on Sway. Sway seems to be working too, aside from the overall high GPU usage.

@Quackdoc
Contributor Author

Quackdoc commented Jan 20, 2024

Update: I built #278 and it seems to have fixed the issue. I'm not sure if that is due to the patch itself or the smithay branch it is tracking, but I've used it for about 2 hours now doing the exact same thing as before and haven't managed to replicate the issue using COSMIC_RENDER_DEVICE=/dev/dri/renderD128.

EDIT: I can even play a video on the second monitor without causing any sluggishness on the primary monitor, so this branch is for sure a massive improvement. I just went back and checked against master, and the issues are still there.

EDIT2: It seems that Firefox specifically renders poorly when a window on the second monitor is focused. Chrome doesn't seem to have this issue. This seems like a Firefox issue, but it's not something I've encountered before on other compositors. Both Chromium and mpv seem fine, though, so it has me a bit puzzled.

@Drakulix
Member

update, I built #278 and it seems to have fixed the issue, not sure if due to the patch itself or the smithay branch it is tracking, but I've used it for about 2 hours now doing the exact same thing I was before and haven't managed to replicate the issue.

That is very interesting, given that the PR shouldn't really touch any parts affecting this.

Except for a hack I added for testing, which disables the Vulkan allocator code path I added at some point to fix problems with the Nvidia driver. Can you test master_jammy with just this commit picked from said PR: c4ba323?

If that also doesn't replicate the issue, then this is a huge discovery.

@Quackdoc
Contributor Author

Quackdoc commented Jan 22, 2024

Sorry for the late reply. It also does not replicate the issue with it.

EDIT: Interestingly enough, when I cherry-pick the commit, the issue does go away, but another issue pops up: when mpv is running fullscreen and a window on the second display is focused, some GPU load can somewhat reliably cause mpv to show some sort of screen tearing.

I was able to get a reliable test case by running mpv fullscreen on the primary monitor, with glxgears and Telegram side by side on the second monitor. While glxgears is running, scroll a lot in Telegram and move the mouse around. mpv on the main window will show some sort of tearing artifacts.
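
The test case above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the Telegram binary name `telegram-desktop` is an assumption, the video path is whatever local file you pass in, and window placement on each monitor plus the scrolling and mouse movement still have to be done by hand.

```shell
# Print the names of any required tools that are not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical reproduction helper: start the three clients from the test case.
# Assumption: the Telegram binary is called telegram-desktop on this system.
run_repro() {
    video="$1"
    missing=$(missing_tools mpv glxgears telegram-desktop)
    if [ -n "$missing" ]; then
        echo "missing tools: $missing" >&2
        return 1
    fi
    mpv --fs "$video" &      # fullscreen; keep it on the primary monitor
    glxgears &               # move to the second monitor
    telegram-desktop &       # place beside glxgears, then scroll and move the mouse
    wait                     # block until all three clients exit
}
```

Run it inside the compositor session with `run_repro /path/to/video.mkv` and watch the mpv window for tearing.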

This issue does not happen when I compile the full branch.

@Drakulix
Member

Thanks for the info. #278 is going to be merged one way or another soon, but I will run some more tests on that specific commit to figure out whether disabling that code path regresses any other configurations, and will merge a fix for the problematic performance behavior asap.

Thanks for all the help debugging!

@Quackdoc
Contributor Author

As an aside, I did try to alleviate the issue by setting context priority, since Iris does support EGL_IMG_context_priority, but it didn't seem to change anything. Maybe a wee bit more responsive? If so, it was within placebo range, so I don't think the GPU being busy was the cause.

@flukejones

Intel + Nvidia currently seems to perform fine with recent Nvidia generations, but not so well with Pascal and Turing cards (on either cosmic-comp or gnome-shell). But I haven't seen what performance is like with Intel Arc + AMD.

From what I've seen, the low performance on Nvidia external outputs is related to the P-state the dGPU is in. I wrote about the issue in https://forums.developer.nvidia.com/t/low-p-state-affects-the-display-output-smoothness/279774 (I've only tested the proprietary driver).

I know that issue isn't specifically related to what is discussed here, but I ended up here during a search before writing the above, so I thought it prudent to link things.

@ids1024
Member

ids1024 commented Jan 23, 2024

#211 is the issue with the previous testing I did for Nvidia performance.

Good to know it may be related to p-states. I'll do more testing when the 550 driver is released (which has some performance fixes, but may not impact that particular issue).

@flukejones

@Drakulix I've been testing the hybridgpu branch (at 0ecca0a) and so far everything seems to be running incredibly smooth. Very similar to KDE-6.

WGPU example (water) frame times are down to 0.95ms or less on the Nvidia output. There is the occasional drop to 1.25ms but barely ever.

@ids1024
Member

ids1024 commented Feb 25, 2024

Slow PRIME performance (like 20fps for an Intel window on a 1440 Nvidia monitor) still seems to be an issue for me with the 1650 mobile on the 550.54.14 driver, with options nvidia NVreg_RegistryDwords="RMForcePstate=0".

@TermoZour

@Quackdoc can you offer instructions on how to set up cosmic-comp and switch to it like you would between Wayland/X (if that's even possible), so I can also test it on a Framework 13 with a dGPU (RX 580 as well)?

I've had similar issues on GNOME as well as KDE. I tried fresh installs of both Nobara KDE and Nobara GNOME, and the issue was still present.
I filed an issue for mesa and mutter, but given I've had the issue on KDE as well...

It's either "these DEs don't work properly because of reasons" or it's a mesa issue but they keep pointing fingers at others.

@Quackdoc
Contributor Author

Quackdoc commented Mar 12, 2024

@TermoZour I'm not sure what exactly you are asking.

If you are asking how to swap GPUs, set one of the env vars below, then start COSMIC. To change the GPU, you need to restart COSMIC:

COSMIC_RENDER_DEVICE=/dev/dri/renderD128
#COSMIC_RENDER_AUTO_ASSIGN=y

If you are asking how to swap between sessions:
Press CTRL+ALT+F2 to swap to a different TTY, then run start-cosmic, cosmic-session, or cosmic-comp. Press CTRL+ALT+F# to swap back to your original session.
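
Put together, the two steps above (pick a GPU, then start the session from a free TTY) might look like the sketch below. The function name `start_cosmic_on` and the `COSMIC_LAUNCHER` override are hypothetical; the launcher defaults to `start-cosmic`, one of the three commands named above.

```shell
# Hypothetical launcher: set the render device for this one invocation,
# then start the session. Run it from a TTY that has no other compositor on it.
# Assumption: COSMIC_LAUNCHER lets you swap in cosmic-session or cosmic-comp.
start_cosmic_on() {
    COSMIC_RENDER_DEVICE="$1" ${COSMIC_LAUNCHER:-start-cosmic}
}
```

For example, `start_cosmic_on /dev/dri/renderD128` starts the session with that render node; to switch GPUs, exit the session and call it again with the other node.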

If you are asking something else, please elaborate.

EDIT: Also unrelated: I've been running Intel as my primary GPU and noticed a perf regression with it that I have yet to bisect. I'm not sure if it's related to dual GPUs or not.

@TermoZour

I was asking how to run COSMIC so I can test my setup with it.

I thought you had to compile it and switch to it somehow from the login screen, like you would to switch between Wayland and X11. But going by your instructions, I just need a TTY without GNOME running on it, and can start it the same way you would start GNOME from a TTY.

PS: I'm not very familiar with Linux and all its UI components.

@Quackdoc
Contributor Author

Most compositors can be launched from a TTY; you can do this with KDE and Sway, for instance. It is worth noting that there can be bugs when running multiple compositors at the same time, but I haven't encountered any with Sway and COSMIC.
