Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable GL_KHR_cooperative_matrix Vulkan extension if not available. #11117

Merged
merged 3 commits into from
Jan 8, 2025

Conversation

mbaudier
Copy link
Contributor

@mbaudier mbaudier commented Jan 7, 2025

Allows to support Vulkan on platforms where the GL_KHR_cooperative_matrix extension is not available, such as Debian 12/bookworm.

Resolves #11052.

@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Jan 7, 2025
@0cc4m 0cc4m self-requested a review January 7, 2025 06:22
Copy link
Collaborator

@0cc4m 0cc4m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it and it's fine on systems with versions new enough to support the extension.

The code looks good, I only have two minor comments. Once those are addressed I'll merge it. Thank you for the contribution!

ggml/src/ggml-vulkan/CMakeLists.txt Outdated Show resolved Hide resolved
ggml/src/ggml-vulkan/ggml-vulkan.cpp Outdated Show resolved Hide resolved
@jeffbolznv
Copy link
Collaborator

LGTM. Yet another reason we should consider building glslc from a known good commit.

@0cc4m 0cc4m merged commit 02f0430 into ggerganov:master Jan 8, 2025
48 checks passed
@LostRuins
Copy link
Collaborator

LostRuins commented Jan 9, 2025

Hi, I'm having some issues after this PR, not sure if it's a user error on my part.

Scenario: My glslc and vulkan install supports coopmat1. However, I wish to produce a build that excludes coopmat as if I did not have it.

Approach: I am manually setting GGML_VULKAN_COOPMAT_GLSLC_SUPPORT to not defined, meanwhile VK_KHR_cooperative_matrix remains set to 1 from the headers in the latest installed Vulkan SDK.

Result: Builds fine. However at runtime during ggml_vulkan: Compiling shaders I get a segmentation fault. Running a backtrace shows it happens at

0x00007ffe88eb361b in vk_gr2608GetInstanceProcAddr () from C:\Windows\System32\DriverStore\FileRepository\nvmii.inf_amd64_881a68749106a57c\nvoglv64.dll

Now if I define GGML_VULKAN_COOPMAT_GLSLC_SUPPORT, then everything continues to work as per normal, but it is my understanding that I should be able to toggle this flag rather than be forced to use it with the latest Vulkan SDK on windows?

To reproduce this: Have a system that supports coopmat, install the latest Vulkan SDK on windows, then remove this line

add_compile_definitions(GGML_VULKAN_COOPMAT_GLSLC_SUPPORT)

@0cc4m
Copy link
Collaborator

0cc4m commented Jan 9, 2025

Now if I define GGML_VULKAN_COOPMAT_GLSLC_SUPPORT, then everything continues to work as per normal, but it is my understanding that I should be able to toggle this flag rather than be forced to use it with the latest Vulkan SDK on windows?

No, the goal is just to ensure compatibility with Debian (or other Linux distros) where the package manager versions of glslc and the Vulkan headers are too old to support the VK_KHR_cooperative_matrix extension. On systems without a package manager (like Windows) you can (and should) just install the latest SDK.

I see wrong results when I force disable GGML_VULKAN_COOPMAT_GLSLC_SUPPORT, but no crash. This could be fixed, but it's not a case that is likely to happen on any system, since it requires a mismatch between what glslc supports and what the Vulkan headers support.

@mbaudier
Copy link
Contributor Author

mbaudier commented Jan 9, 2025

To put it slightly differently, GGML_VULKAN_COOPMAT_GLSLC_SUPPORT is not meant to be set manually like the usual GGML_* flags, but to be automatically set when an older Vulkan SDK is detected. It is actually an internal flag, maybe it should be clearer in its name?

@LostRuins
Copy link
Collaborator

update: checked with occam and this worked for me

diff --git a/ggml/src/ggml-vulkan/ggml-vulkan.cpp b/ggml/src/ggml-vulkan/ggml-vulkan.cpp
index 07745242..bc6ff4a3 100644
--- a/ggml/src/ggml-vulkan/ggml-vulkan.cpp
+++ b/ggml/src/ggml-vulkan/ggml-vulkan.cpp
@@ -2184,6 +2184,10 @@ static vk_device ggml_vk_get_device(size_t idx) {
             device->coopmat_support = false;
         }

+#if !defined(GGML_VULKAN_COOPMAT_GLSLC_SUPPORT)
+        device->coopmat_support = false;
+#endif
+
         std::vector<vk::QueueFamilyProperties> queue_family_props = device->physical_device.getQueueFamilyProperties();

         // Try to find a non-graphics compute queue and transfer-focused queues

allows me to build as if i don't have coopmat, no crash and output seems coherent, at non-coopmat speeds, output binary is 30% smaller.

But he did mention its not a supported use case, instead I should use GGML_VK_DISABLE_COOPMAT to disable at runtime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning Vulkan Issues specific to the Vulkan backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compile bug: Vulkan shaders not compiling any more on Debian Stable (12/bookworm)
4 participants