Xwayland VRAM usage is still excessive when resizing X11 apps under wayland. #126

Open
shelterx opened this issue Aug 11, 2024 · 39 comments

Comments

@shelterx

I'm not sure what the "Fix an issue causing KDE crashes, which also caused excessive VRAM usage when resizing." was supposed to fix.
Resizing X11 apps like Steam still makes Xwayland VRAM usage skyrocket, though it seems to stop at around 1.3 GB. I'm not sure exactly which component causes this, but I'll leave it here.

@ofourdan

For background, that was already reported against Xwayland here:
https://gitlab.freedesktop.org/xorg/xserver/-/issues/1687

@Bunnysword

@thesword53

This issue is not limited to Xwayland:

  • If you resize a Wayland window, the GPU memory usage of kwin_wayland, gnome-shell, or any other Wayland compositor increases in the same way.
  • On the KDE desktop, if you hold the left mouse button and drag to draw a selection rectangle, the GPU memory usage of plasmashell also increases in the same way.

@shelterx
Author

@thesword53 indeed... you are correct, how did I miss that. I resized Konsole and here's the result:
image

Good find!

Version used:
Driver: 560.31.02
egl-wayland-f30cb0e

@shelterx
Author

shelterx commented Aug 15, 2024

This issue is not limited to Xwayland:

It's not limited to Wayland sessions either; kwin_x11 also eats VRAM when resizing. I don't recall having that issue before.
So it's probably not an egl-wayland bug at all. I'll leave the issue open until it's fixed tho'.

However, kwin_x11 does release the memory after a while, but it does so slowly.

@Arcitec

Arcitec commented Sep 23, 2024

I can confirm; I went back to X11. This is on Fedora Workstation 40 with NVIDIA 560.35.03 and an RTX 3090.

On Wayland my average desktop uses 11 GB / 24 GB VRAM (46%) with just a web browser open. It impairs my ability to run games or AI workloads, because basically half the card's memory is wasted. One time it even reached the point where all apps crashed because VRAM ran out.

On X11 my average desktop uses 3 GB / 24 GB VRAM (12.5%) for the same workload. Games and AI workloads run great.

The issue seems to be:

  • NVIDIA driver leaks VRAM. The amount wasted/hoarded always grows over time.
  • Xwayland is a persistent process on Wayland (it sits around forever), so the leaked VRAM from running X11 apps NEVER gets released.
  • Even when my GPU needs the hoarded/leaked VRAM, the Xwayland process doesn't release it.
  • The conclusion from Xwayland devs was that the problem is from NVIDIA driver because it doesn't leak with AMD: https://gitlab.freedesktop.org/xorg/xserver/-/issues/1687

This is in addition to Wayland's other issues, such as Chromium-based browsers frequently breaking when opening new windows, causing the windows to render in a glitched way and offset by about a titlebar's height from the top of the screen, and you have to click and drag the "invisible" (totally transparent) titlebar to resize the window to get it to render properly.

And Wayland's lack of basic features such as global keyboard shortcuts/keybinds.

It's not just NVIDIA that has problems on Wayland. Most things do.

I am going back to X11 for the next 12 months and will see if Wayland is better in 2026. At least X11 is usable. :D Wayland needs more time in the bakery. Fedora plans to remove X11 by default in Fedora 41, but I'll just install it manually since Wayland is totally unusable at the moment.

@ryzendew

I have fully tried to reproduce this issue, to no avail.
If anyone has a 100% reliable way to reproduce it, please let me know. Tried on Arch, Fedora, and PikaOS 4.

@kelvie

kelvie commented Sep 23, 2024

I can reproduce this pretty readily on KDE on Arch, running NVIDIA 560.35.03. Just open a Konsole window and resize it a bunch of times, then run nvidia-smi and notice that the VRAM usage of kwin_wayland goes up to about 10% of the total VRAM and more or less stops there.

Perhaps this does have to do with how nvidia is allocating, or garbage collecting, or perhaps even reporting VRAM.
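
For anyone who wants to log the numbers rather than eyeball them, here is a minimal sketch that pulls per-process usage out of the process table printed by plain `nvidia-smi`. The row layout in the regex and the sample rows are assumptions for illustration (the table format differs between driver versions), so adjust the pattern to whatever your driver prints.

```python
import re
import subprocess

# Assumed row layout of the "Processes" table printed by plain `nvidia-smi`.
# The exact columns vary between driver versions, so treat this as a sketch.
ROW = re.compile(
    r"^\|\s+(?P<gpu>\d+)\s+\S+\s+\S+\s+(?P<pid>\d+)\s+(?P<type>[CG+]+)\s+"
    r"(?P<name>\S.*?)\s+(?P<mib>\d+)MiB\s+\|\s*$"
)


def parse_process_table(text):
    """Return {process name: MiB}, summed over all rows that match ROW."""
    usage = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            # Keep only the executable's base name, dropping path and arguments.
            name = match.group("name").split()[0].rsplit("/", 1)[-1]
            usage[name] = usage.get(name, 0) + int(match.group("mib"))
    return usage


def current_usage():
    """Run nvidia-smi and parse its output (needs the NVIDIA driver installed)."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return parse_process_table(result.stdout)


# Hypothetical sample rows, not real output from this thread.
SAMPLE = """\
|    0   N/A  N/A      1234      G   /usr/bin/kwin_wayland                 2400MiB |
|    0   N/A  N/A      2345      G   /usr/bin/Xwayland                       13MiB |
"""

if __name__ == "__main__":
    print(parse_process_table(SAMPLE))
```

Calling `current_usage()` once before and once after a round of resizing gives two dictionaries that can be compared per process.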

@shelterx
Author

I have fully tried to reproduce this issue to no avail

https://streamable.com/2ufy13

This sort of demonstrates the issue, watch kwin VRAM usage after the resizing.

  • This is under Xorg so it does free it (even faster here since OBS is running it seems).
  • If you do the same thing under Wayland, it never gets freed until you actually close the Konsole window. And if you resize an X11 app under Wayland, like Steam, Xwayland VRAM usage does not get freed at all.

@ryzendew

ryzendew commented Sep 25, 2024

OK, confirmed it's a thing on GNOME as well.
A friend on a 7800XT confirmed it happens on AMD as well.

adding a video https://streamable.com/ht9cu2

@kelvie

kelvie commented Sep 25, 2024 via email

@shelterx
Author

shelterx commented Sep 25, 2024

@kelvie are you sure? It's possible it's been like that for a while. It's easy to miss that it happens with kwin_wayland, because if you close the window that caused the VRAM leak, kwin's VRAM usage goes back to normal.

Update
Here's the 550.40.71 dev driver, so yeah, it's in 550 too.
Image

@kelvie

kelvie commented Sep 25, 2024

@shelterx

https://gitlab.freedesktop.org/xorg/xserver/-/issues/1617

This has happened with Xwayland since 545.29.06; I had to switch all my apps to Wayland-native ones to combat this if I wanted my VRAM.

Only with 560 did it start happening with kwin_wayland for me (I also had a short stint with Hyprland, so I don't remember when I switched back), and currently with 560, as far as I can tell, kwin_wayland never gives back the memory even after I close the windows.

@ryzendew

https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1704 let's test this

@ryzendew

After testing that PR, the issue is semi-fixed. Here is a video: https://gitlab.freedesktop.org/-/project/371/uploads/4af729a970faa28b667669bac1b8531f/Screencast_From_2024-09-25_20-48-02.mp4

@gilvbp

gilvbp commented Sep 26, 2024

After testing that PR, the issue is semi-fixed. Here is a video: https://gitlab.freedesktop.org/-/project/371/uploads/4af729a970faa28b667669bac1b8531f/Screencast_From_2024-09-25_20-48-02.mp4

FYI: this is not a fix. It only helps to debug/track/trace.

@kelvie

kelvie commented Sep 26, 2024

Yeah, I went back and tested with 555 and 550 as well, and it's still the same thing: kwin_wayland uses 2.4 to 2.7 GB of my 24 GB VRAM after resizing windows and doesn't free it even after the windows are closed.

@kelvie

kelvie commented Sep 26, 2024

I've started a new topic here: https://forums.developer.nvidia.com/t/multiple-wayland-compositors-not-freeing-vram-after-resizing-windows/307939

There are multiple issues here (Xwayland, compositors) and multiple components (xorg-server, multiple wayland compositors, this repo, nvidia drivers), so hopefully we get to the bottom of this.

In summary, I've reproduced this on:

nvidia versions:

  • 560.35.03
  • 555.58.02
  • 550.78

compositors:

  • kwin_wayland
  • sway
  • weston

egl-wayland versions:

  • 1.1.9
  • 1.1.13
  • 1.1.2
  • 1.1.16

All with the same test: open a terminal, resize it over and over again, close it, and check the compositor's VRAM usage using nvidia-smi.

Every time it's around 2.5 GB on my 24 GB 4090.
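
As a quick sanity check on how this lines up with the "about 10% of total VRAM" ceiling mentioned earlier in the thread, the arithmetic can be sketched as below. The helper name `retained_share` is made up for this example; the 2.5 GB and 24 GB figures are the ones reported above.

```python
def retained_share(retained_gb, total_gb):
    """Return retained VRAM as a percentage of the card's total VRAM."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return 100.0 * retained_gb / total_gb


if __name__ == "__main__":
    # 2.5 GB held by the compositor on a 24 GB card, as reported above.
    print(f"{retained_share(2.5, 24):.1f}% of total VRAM")
```

That works out to a little over 10%, which is consistent with usage plateauing at roughly a tenth of the card's memory.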

@shelterx
Author

I had an old install with 525 and KDE 5, can't say I managed to reproduce it there but I had no Wayland session installed so I had to rely on the X11 test.

@shelterx
Author

shelterx commented Sep 27, 2024

So...

  • The Xwayland VRAM allocation release issue is not present in the Vulkan dev drivers 550.40.71 and 550.40.75; usage stays at around 10-13 MB.
  • kwin_wayland does release the memory if you minimize the window you resized. That's probably not working as intended, but it's a quick fix; try that with 560 (I'm a bit tired of switching drivers now). UPDATE: Actually, all windows that use kwin need to be minimized...

@kelvie

kelvie commented Sep 27, 2024

@shelterx Wow, that's a trippy workaround (the minimizing one for kwin_wayland). It does seem to work; I wonder which (E)GL calls are at work here. plasmashell doesn't seem to free its VRAM, but maybe that's another issue.

@kelvie

kelvie commented Sep 27, 2024

I'm testing this a bit more, and it seems just using the Alt+TAB switcher in kwin resets the VRAM -- very strange. Maybe something to do with how the window thumbnails are being created for that?

@cubanismo
Collaborator

Thank you for all the reports and attempts to narrow down the issue. I believe there are actually two separate issues tracked here:

  • Excessive memory consumption by Xwayland.
  • Excessive memory consumption by Wayland compositors, e.g., kwin_wayland.

I've looked into the latter issue, and at this point it is well understood. We do not need additional information or reports of reproductions for that issue. See below for more information.

We have not been able to reproduce the issues with Xwayland/X applications with the latest version of Xwayland and latest drivers. If you are still experiencing that particular issue, please share reproduction steps (ideally starting from a clean boot), the amount of persistent memory usage you are seeing and how you are measuring it, and your system details (Run nvidia-bug-report.sh, attach the log it generates, list your Xwayland and compositor version numbers and ideally distro package versions if you're using distro packages).

For the Wayland compositor memory usage issue, there isn't a leak per se, but the heuristics that decide which memory to retain for performance reasons aren't working optimally when presented with the OpenGL API usage typical of a Wayland compositor. While we work to develop and deploy a driver fix, I can offer this workaround:

  • Download this JSON file: 50-limit-free-buffer-pool-in-wayland-compositors.txt.
  • Edit it to replace 'kwin_wayland' with the name of your Wayland compositor if necessary.
  • Create the directory '/etc/nvidia/nvidia-application-profiles-rc.d' if it doesn't already exist on your system, and place the file there.
  • Restart your compositor (Reboot or log out/log back in).

That should resolve this class of memory usage issues within the named application. You can also duplicate the entire rule in the JSON file if you regularly switch between multiple Wayland compositors, e.g.:

        {
            "pattern": {
                "feature": "procname",
                "matches": "kwin_wayland"
            },
            "profile": "Limit Free Buffer Pool On Wayland Compositors"
        },
        {
            "pattern": {
                "feature": "procname",
                "matches": "gnome-shell"
            },
            "profile": "Limit Free Buffer Pool On Wayland Compositors"
        }
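
If hand-duplicating rules gets tedious, the file can also be generated. The following is a rough sketch, not an official tool: `build_profile_config` is a made-up helper, and since the attached file's contents are not reproduced in this comment, the profile body (`GLVidHeapReuseRatio` set to 1) is assumed to match the rules file quoted later in this thread.

```python
import json

PROFILE_NAME = "Limit Free Buffer Pool On Wayland Compositors"


def build_profile_config(process_names):
    """Build an application-profile document with one procname rule per process."""
    rules = [
        {
            "pattern": {"feature": "procname", "matches": name},
            "profile": PROFILE_NAME,
        }
        for name in process_names
    ]
    profiles = [
        {
            "name": PROFILE_NAME,
            # Assumed setting, taken from the rules file quoted later in the thread.
            "settings": [{"key": "GLVidHeapReuseRatio", "value": 1}],
        }
    ]
    return {"rules": rules, "profiles": profiles}


if __name__ == "__main__":
    config = build_profile_config(["kwin_wayland", "gnome-shell"])
    print(json.dumps(config, indent=4))
```

Redirect the output to a file in '/etc/nvidia/nvidia-application-profiles-rc.d' and restart the compositor, as in the steps above.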

@kelvie

kelvie commented Sep 27, 2024

Thank you for this! And to be clear, we are to create a .txt file filled with JSON, and not a .json file in that directory?

Edit: Just tested the instructions as is, copied that file and placed it in the directory with a .txt extension and it's fixed! thank you!

@cubanismo
Collaborator

We are to create a .txt file filled with JSON, and not a .json file in that directory?

The driver doesn't care what the file is called. I didn't have an extension on the file originally, but github only accepts certain file names (See https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/attaching-files), so I renamed it .txt. Name it whatever you like.

@shelterx
Author

shelterx commented Sep 27, 2024

@cubanismo
Thank you for your reply, much appreciated.
I will try the kwin/gnome-shell workaround. I think it happens with plasmashell too tho'.

We have not been able to reproduce the issues with Xwayland/X applications with the latest version of Xwayland and latest drivers.

Not sure which driver you are referring to as latest, but I can't reproduce it with the latest dev drivers. With 560.35.03, however, it's easy: just resize the Steam window, for example.

Edit:
The workaround for kwin_wayland works here.
Additional info: I see no VRAM spikes in KDE like the ones @mlhhqh experienced in GNOME, mentioned below.

@mlhhqh

mlhhqh commented Sep 27, 2024

    {
        "pattern": {
            "feature": "procname",
            "matches": "kwin_wayland"
        },
        "profile": "Limit Free Buffer Pool On Wayland Compositors"
    },
    {
        "pattern": {
            "feature": "procname",
            "matches": "gnome-shell"
        },
        "profile": "Limit Free Buffer Pool On Wayland Compositors"
    }

Can confirm it works on GNOME 46, Silverblue, 560.35.03.

Still very subpar results. After opening a GNOME session, opening a terminal (Wayland) and resizing it a bit, usage spikes to 1.4 GB (from ~300 MB).
VRAM usage goes down very slowly (yet noticeably on user interaction, like moving a window).

@ppogorze

@cubanismo works for me! Gnome 47, CachyOS (Arch), 560.35.03. VRAM usage stays at ~400MB while resizing terminal window with nvtop open (before it was up to 1.4GB).

@shelterx
Author

FYI, you can also add kwin_x11 if you use Xorg, it makes kwin_x11 stay on sane levels and doesn't overallocate.

@kakra

kakra commented Sep 28, 2024

FYI, you can also add kwin_x11 if you use Xorg, it makes kwin_x11 stay on sane levels and doesn't overallocate.

Yeah, and while at it, add plasmashell, too. plasmashell is happy with under 300MB now instead of climbing up above 700MB. I also added the Xorg process itself. Not sure if it helps, seems to be a little lower (maybe 100-200MB less).

@kelvie

kelvie commented Sep 28, 2024

FYI, you can also add kwin_x11 if you use Xorg, it makes kwin_x11 stay on sane levels and doesn't overallocate.

Yeah, and while at it, add plasmashell, too. plasmashell is happy with under 300MB now instead of climbing up above 700MB. I also added the Xorg process itself. Not sure if it helps, seems to be a little lower (maybe 100-200MB less).

We may want to start a separate topic (discussion?) on this. I've also added plasmashell and haven't noticed a difference (still uses 900MB)

@Edu4rdSHL

Edu4rdSHL commented Sep 28, 2024

This works for processes started specifically with these names (the compositor ones), but most of the time the “leak” occurs in other apps (Electron apps are a clear example) when you resize them, and the memory never gets freed.

It's a really nice improvement, but not yet a solution because you will need to add a pattern that does match every process name where you want to perform that operation, which can be from a couple to dozens of apps.

Anyway, thanks again for it.

@kelvie

kelvie commented Sep 28, 2024

It's a really nice improvement, but not yet a solution because you will need to add a pattern that does match every process name where you want to perform that operation, which can be from a couple to dozens of apps.

@cubanismo did say that they're working on the fix on the driver side, as for the workaround, searching for nvidia-application-profiles-rc.d, I ran across this:

https://download.nvidia.com/XFree86/Linux-x86/384.59/README/profiles.html

So a rules file like this would apply to all processes:

{
    "rules": [
        {
            "pattern": {
                "feature": "true",
                "matches": "foobar"
            },
            "profile": "Limit Free Buffer Pool On Wayland Compositors"
        }
    ],
    "profiles": [
        {
            "name": "Limit Free Buffer Pool On Wayland Compositors",
            "settings": [
                {
                    "key": "GLVidHeapReuseRatio",
                    "value": 1
                }
            ]
        }
    ]
}

And I tested this, with it applied, losslesscut, an electron app I use, can now be resized without holding on to so much VRAM. However as @cubanismo mentions, this behaviour is like that by default for performance reasons, so presumably the tradeoff here is performance of some sort, and I won't speculate on that as I've lived through multiple decades of "self reported performance tips and tricks".
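
Before dropping a hand-edited rules file into the directory, a quick sanity check along these lines may save a compositor restart. It is only a sketch under two assumptions: the file is strict JSON, and it is self-contained like the one above (every profile a rule names is defined in the same file). `check_profile_config` is a made-up helper and does not reproduce the driver's own validation.

```python
import json


def check_profile_config(text):
    """Return a list of problems; an empty list means the basic checks passed."""
    config = json.loads(text)  # raises json.JSONDecodeError on malformed input
    defined = {profile.get("name") for profile in config.get("profiles", [])}
    problems = []
    for index, rule in enumerate(config.get("rules", [])):
        if "pattern" not in rule:
            problems.append(f"rule {index}: missing 'pattern'")
        profile = rule.get("profile")
        # Only string references are checked; other forms are left alone.
        if isinstance(profile, str) and profile not in defined:
            problems.append(
                f"rule {index}: profile {profile!r} is not defined in this file"
            )
    return problems


if __name__ == "__main__":
    sample = """
    {
        "rules": [
            {"pattern": {"feature": "true", "matches": "foobar"},
             "profile": "Limit Free Buffer Pool On Wayland Compositors"}
        ],
        "profiles": [
            {"name": "Limit Free Buffer Pool On Wayland Compositors",
             "settings": [{"key": "GLVidHeapReuseRatio", "value": 1}]}
        ]
    }
    """
    print(check_profile_config(sample) or "no problems found")
```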

@Edu4rdSHL

Edu4rdSHL commented Sep 28, 2024

@cubanismo did say that they're working on the fix on the driver side, as for the workaround, searching for nvidia-application-profiles-rc.d, I ran across this:
https://download.nvidia.com/XFree86/Linux-x86/384.59/README/profiles.html

Yup, I found that too, I have explicitly added all the apps that I most use there, excluding apps that rely merely on graphics (games, more exactly) to avoid possible drawbacks. It seems to be working fine, glad to see that the root problem has been found.

@Edu4rdSHL

@cubanismo as for the "Xwayland" issue, it's mostly the same as the kwin one, see the following video for steps to reproduce:

TLDR: resize a window under wayland/xwayland several times and see how memory goes up. It's also solved if you match the process name and add the workaround you posted before, so I think it's related.

2024-08-07.23-47-26.mp4

Credits of the video: https://forums.developer.nvidia.com/t/560-release-feedback-discussion/300830/165?u=edu4rdshl

@shelterx
Author

shelterx commented Sep 28, 2024

Hold your horses now everyone, I am going to keep this issue open until everything is fixed. I got a bit sidetracked myself here but now we have a workaround for kwin & various applications thanks to @cubanismo stepping in here.

If you want to discuss application workarounds and whatnot, I suggest you do that on the NVIDIA forums.

This issue is really about the Xwayland VRAM allocation problem, so please stay on that topic from now on. Thank you.

@cubanismo
Collaborator

Thanks @Edu4rdSHL. I've seen the video, but am unable to reproduce the high memory usage seen there locally at the moment. Hence the request for additional information from those who can reproduce this with the latest Xwayland and drivers.

@shelterx, agreed, let's keep this open until there's a complete fix. Just wanted to give people options in the meantime.

@Edu4rdSHL

Edu4rdSHL commented Oct 1, 2024

@cubanismo, thanks for your reply. Interesting that you can't reproduce it; I can reproduce it 100% of the time. To be more specific, my setup is:

  • DE: Gnome 47 (Wayland)
  • OS: ArchLinux (so latest software version is installed)
  • Package versions: Nvidia 560.35.03, mesa 24.2.3, wayland 1.23.1, xwayland 24.1.2

As a side note (not trying to go offtopic), the same high Xwayland usage happens on native wayland apps too when resizing them, exactly the same, and putting rules like you proposed before, fixes it. If you like, I can open an additional issue for this same problem but for pure wayland apps, or we can change the topic to "when resizing apps under wayland".

Edit: please let me know if you need more details about the setup to reproduce.

@shelterx
Author

shelterx commented Oct 1, 2024

@cubanismo

Regarding driver versions and Xwayland VRAM:

  • I can reproduce it on New Feature Branch 560.31.02: just resize an X11 app (Steam is a good and fast example) and watch the Xwayland VRAM go up until it hits the 10% limit.
  • Adding Xwayland to the application profile does work around the issue for 560.31.02, so it's probably the same root cause, more or less.
  • I cannot reproduce it with Vulkan Dev 550.40.75.

Settings should be the same, all I changed is the driver.
2 monitors connected, DP and HDMI.
RTX 4070
CachyOS (arch based) - Up to date.
KDE
