Graphics Pipeline Library #2798

Closed
Arcitec opened this issue Aug 8, 2022 · 5 comments

@Arcitec

Arcitec commented Aug 8, 2022

Hi, I've been following the Graphics Pipeline Library developments with great curiosity, and am amazed at the insanely high quality of work you're doing. @doitsujin You're an absolute genius!

But even though I've spent hours reading commits relating to the new library, some things are unclear to me and I've given up trying to understand them myself.

I have a few questions:

  1. For those who don't know, DirectX games basically all spend their loading screens compiling partial shader pipelines (i.e. a vertex shader, a tessellation shader, etc.), which are then combined into full pipelines during gameplay. That's how they stay stutter-free during gameplay: the lego pieces are ready to assemble. The new Graphics Pipeline Library allows DXVK to implement most of that "pre-compiled lego pieces" behavior on Linux too, but I was unable to find out whether it does that yet (or if it's planned). Meaning, when a game loading screen calls something like ID3D11Device::CreateVertexShader or ID3D11Device::CreatePixelShader, does DXVK now directly convert that to the Vulkan Graphics Pipeline Library, compiling those DirectX shader "lego pieces" during loading screens, for quick fast-linking later? If so, we'd get similar loading screens on Linux as on Windows and wouldn't need a shader cache anymore. I am super excited about that possibility.
  2. How does DXVK implement the "create optimized full pipeline" calls from the new API? The Vulkan devs recommended fast-linking everything initially, while starting "full optimization" threads in the background, and only doing full optimization for heavily used shaders, and swapping the optimized versions in later in realtime. I was unable to find out how DXVK handles the optimization calls. Does it use the fast-linked versions first, to avoid hitching? How does it trigger the full optimization, and is that done in the background (asynchronously)?
  3. I noticed that DXVK blacklists geometry and tessellation shaders from being pre-compiled, which makes sense since the Vulkan team sadly decided to force those two shader types to be a combined stage where both of the shaders (geometry and tessellation) must be provided together for simultaneous compilation. So I can understand that you blacklisted them. But could this lack of pre-compiled tessellation/geometry shaders lead to still having a lot of hitching in games?

I also have two optimization ideas for the 3rd point (tessellation and geometry shaders) and wonder what your thoughts are:

  • The first (and most obvious) idea would be to wait until the game tries to create a pipeline with a tessellation and/or geometry shader, and THEN trigger a "compile geometry+tessellation pipeline" Vulkan GPL call, and then fast-link that small shader chunk to the remaining pre-compiled pipeline. But perhaps DXVK has already implemented it that way? Hopefully this (pre-compiling everything except the tessellation/geometry shaders, and using fast-linking when they are needed) would be enough to practically eliminate hitching.
  • The other idea would be that you actually do compile tessellation and geometry shaders, as standalone "tessellation without geometry" and "geometry without tessellation" shaders in Vulkan, and then just hope that the game uses one without the other. This might be the case frequently enough in games that pre-compiling them independently this way would be a big speedup? I am not well-versed enough, but isn't it true that most games only use a tessellation shader, and almost no games use geometry shaders? If so, pre-compiling tessellation shaders as standalone Graphics Pipeline Library lego pieces might be a good idea?

Lastly, if you need real-world testing of the new code paths, I'd be willing to help out. I just need to figure out which exact NVIDIA driver I need. It's a bit of a mess. https://developer.nvidia.com/vulkan-driver has a "beta driver v515.49.10" from July 20th, but they've also released a general, public "Latest Production Branch Version: 515.65.01" from August 2nd. I'm unsure if the 515.65.01 driver already contains the required Graphics Pipeline Library support, or if I need to install beta v515.49.10... I assume that if anyone in the world knows which driver is correct, it would be you. :)

With best regards,

Johnny

@doitsujin
Owner

doitsujin commented Aug 8, 2022

> Meaning, when a game loading screen calls something like ID3D11Device::CreateVertexShader or ID3D11Device::CreatePixelShader, does DXVK now directly convert that to the Vulkan Graphics Pipeline Library, compiling those DirectX shader "lego pieces" during loading screens, for quick fast-linking later?

Yes, it does. The implementation is pretty much done, and it can be used on Nvidia's Vulkan developer drivers (515.49.10).

The code to fast-link can be found here. The rest is a bit more complicated, but it starts with the D3D9/D3D11 front-ends calling DxvkPipelineManager::registerShader when the app calls Create*Shader, and eventually a background thread will call this code, which compiles the pipeline library.

> Does it use the fast-linked versions first, to avoid hitching? How does it trigger the full optimization, and is that done in the background (asynchronously)?

The whole process of obtaining a new pipeline is done by this code. TL;DR:

  1. We try to create a monolithic pipeline from the driver's own shader cache using VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED_BIT.
  2. If that fails, fast-link a pipeline and kick off a background thread to create an optimized pipeline.
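
The two steps above can be sketched roughly as follows. This is a minimal, self-contained model and not DXVK's actual code: `PipelineSource`, `tryCached`, `fastLink`, `optimizeQueue` and `acquirePipeline` are all hypothetical names, the callbacks stand in for the real Vulkan calls, and a plain queue stands in for the background worker thread.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names come from DXVK.
enum class PipelineKind { CachedMonolithic, FastLinked };

struct Pipeline {
  PipelineKind kind;
  std::string  key;   // identifies the shader/state combination
};

struct PipelineSource {
  // Models creating a monolithic pipeline with
  // VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED_BIT: it yields a
  // pipeline only if the driver can provide one without compiling.
  std::function<std::optional<Pipeline>(const std::string&)> tryCached;
  // Models linking pre-compiled pipeline libraries (cheap, less optimized).
  std::function<Pipeline(const std::string&)> fastLink;
  // Keys waiting for an optimized compile; a real implementation would
  // hand these to a background thread instead of a plain vector.
  std::vector<std::string> optimizeQueue;
};

Pipeline acquirePipeline(PipelineSource& src, const std::string& key) {
  assert(src.tryCached && src.fastLink);

  // Step 1: ask the driver's cache, refusing to compile on a miss.
  if (auto cached = src.tryCached(key))
    return *cached;

  // Step 2: fast-link now to avoid a hitch, optimize later.
  Pipeline linked = src.fastLink(key);
  src.optimizeQueue.push_back(key);
  return linked;
}
```

In this model the caller always gets a usable pipeline immediately, and only cache misses generate background work.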

> But could this lack of pre-compiled tessellation/geometry shaders lead to still having a lot of hitching in games?

It can cause a few hitches, but tessellation pipelines are rare, even in games that use a lot of tessellation. Not the end of the world.

> The first (and most obvious) idea would be to wait until the game tries to create a pipeline with a tessellation and/or geometry shader, and THEN trigger a "compile geometry+tessellation pipeline" Vulkan GPL call

I have considered this, it's not impossible, but we'd need to compile the combined pre-rasterization library at draw time anyway, so there will always be some hitching since ~half the work is still being done at the last possible moment. It might be less bad than compiling a monolithic pipeline.

Might try to implement this in the future; for the initial implementation this was just not a priority since it adds a bunch of extra complexity.
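
To make the trade-off concrete, here is a purely hypothetical sketch of that idea (it is not something DXVK is stated to implement): a cache of combined pre-rasterization libraries keyed by the exact stage combination. `PreRasterKey`, `PreRasterLibrary` and `PreRasterCache` are invented names, and the counter stands in for the expensive compile.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key: names of the shaders bound to the vertex, tess control,
// tess eval and geometry stages. An empty string means "stage unused".
using PreRasterKey =
  std::tuple<std::string, std::string, std::string, std::string>;

struct PreRasterLibrary { int id; };

class PreRasterCache {
public:
  // Returns the library for this stage combination, compiling on first use.
  const PreRasterLibrary& get(const PreRasterKey& key) {
    auto it = m_libs.find(key);
    if (it == m_libs.end()) {
      // Stand-in for the draw-time compile: this is the hitch that remains
      // the first time a given combination is seen.
      ++m_compiles;
      PreRasterLibrary lib{static_cast<int>(m_libs.size())};
      it = m_libs.emplace(key, lib).first;
    }
    return it->second;
  }

  std::size_t compiles() const { return m_compiles; }

private:
  std::map<PreRasterKey, PreRasterLibrary> m_libs;
  std::size_t m_compiles = 0;
};
```

The first draw with a new combination still pays for a compile, which matches the point above that some hitching would remain; only repeat uses of the same combination are free.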

> The other idea would be that you actually do compile tessellation and geometry shaders, as standalone "tessellation without geometry" and "geometry without tessellation" shaders in Vulkan, and then just hope that the game uses one without the other.

This is not possible. Pre-rasterization pipelines need all participating stages (i.e. vertex, or vertex + tess control + tess eval, or vertex + geo, or all of them at once), and we simply don't get enough info from the D3D app to do this.
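
The stage combinations listed above can be expressed as a small check. This is an illustrative sketch only: the `StageBits` flags and `isCompletePreRasterSet` are hypothetical names for this example, not DXVK or Vulkan identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stage flags for this sketch.
enum StageBits : std::uint32_t {
  StageVertex      = 1u << 0,
  StageTessControl = 1u << 1,
  StageTessEval    = 1u << 2,
  StageGeometry    = 1u << 3,
};

constexpr std::uint32_t AllPreRasterStages =
  StageVertex | StageTessControl | StageTessEval | StageGeometry;

// True if `stages` is one of the four complete sets named in the reply:
// VS, VS + TCS + TES, VS + GS, or all four together.
bool isCompletePreRasterSet(std::uint32_t stages) {
  assert((stages & ~AllPreRasterStages) == 0);

  // Every set starts with a vertex shader.
  if ((stages & StageVertex) == 0)
    return false;

  // Tessellation control and evaluation must appear together or not at all.
  bool hasTcs = (stages & StageTessControl) != 0;
  bool hasTes = (stages & StageTessEval) != 0;
  return hasTcs == hasTes;   // geometry is optional on top of either
}
```

A lone tessellation or geometry shader never forms a complete set here, which is why a standalone "tessellation without geometry" library cannot be built from a single Create*Shader call.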

@walmartshopper

I have also been following this feature closely because I have a handful of games that are unplayable on Proton due to shader compile stuttering, so I still have to dual boot to play them.

I can at least answer the driver question: as of now you still need the Vulkan beta driver. I already tried the new stable 515.65.01 driver thinking it would have the new extension, but it didn't, so I went back to the 515.49.10 Vulkan beta.

As far as I can tell, the graphics pipeline library is only in DXVK master and not in any released versions, so it's not included in any Valve-released versions of Proton. In theory it should be available in the latest version of proton-ge-custom, which uses DXVK master; however, when I tried that it broke all the games I wanted to play. So for now I'm waiting on an official DXVK release that includes the feature and then an official Proton release that includes that version of DXVK, or a release of proton-ge-custom that doesn't break my games. Installing dxvk-master and using it with Lutris might work, but I have not tried it yet.

@K0bin
Collaborator

K0bin commented Aug 17, 2022

I'm closing this as I think the questions have been answered.

If you have more questions, just reopen the issue and ask.

K0bin closed this as completed Aug 17, 2022

@aufkrawall

Perhaps this doesn't warrant a dedicated report: I've noticed that Assassin’s Creed Odyssey still exhibits some notable shader compile stutter when traversing settlements/cities vs. native D3D11.

Side note: the game's performance has also regressed for me with Proton 7.0 and Experimental, which is not related to the DXVK version used. Proton 6.3-8 with DXVK git-master yields better performance (but still the aforementioned shader compile stutter, despite Nvidia Vulkan dev driver 516.49.14).

@K0bin
Collaborator

K0bin commented Aug 27, 2022

Shader compilation stutter can still happen if the game loads shaders at draw time or uses specific features like tessellation.

GloriousEggroll added a commit to GloriousEggroll/proton-ge-custom that referenced this issue Jan 18, 2023
Upstream DXVK implemented the Graphics Pipeline Library (GPL) back in August, which takes over dxvk-async's job:

https://www.khronos.org/blog/reducing-draw-time-hitching-with-vk-ext-graphics-pipeline-library
doitsujin/dxvk#2798

Driver support for Nvidia was added in version 515.49.10:
https://developer.nvidia.com/vulkan-driver

Driver support for AMD/RADV was added in August 2022 and is an ongoing WIP:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17542

Per the above, it can be exposed for testing on AMD/RADV via:
RADV_PERFTEST=gpl

dxvk-async now causes problems with the dxvk-cache, which is not something new users may know how to clear:
Sporif/dxvk-async#55

In light of the above, in addition to the current rebase still not working properly, I am removing the dxvk-async patch from Proton-GE.

RIP dxvk-async 2018-2023