ggerganov
Member

No description provided.

@ggerganov ggerganov requested a review from CISC as a code owner September 23, 2025 07:32
@github-actions github-actions bot added the devops improvements to build systems and github actions label Sep 23, 2025
Collaborator

@CISC CISC left a comment

@ggerganov
Member Author

> Looks good, but no idea what's going on here: https://github.com/ggml-org/llama.cpp/actions/runs/17938949707/job/51010682770?pr=16194#step:3:4196

I'm not sure either.

@0cc4m Let me know if you are interested in adding a MoltenVK workflow to the CI. At the moment, it seems to require some fixes in the IM2COL and CONV kernels to be able to run correctly.

Feel free to push into this branch if you find a fix.

@ericcurtin
Collaborator

Do people have hope KosmicKrisp will be even better than MoltenVK?

Tagging @kpouget here, I think he built this on macOS before.

@0cc4m
Collaborator

0cc4m commented Sep 23, 2025

@ggerganov we reported that problem to MoltenVK and it was fixed in 1.4. Are you sure the MVK version is up to date?

@kpouget

kpouget commented Sep 23, 2025

> @0cc4m Let me know if you are interested in adding a MoltenVK workflow to the CI. At the moment, it seems to require some fixes in the IM2COL and CONV kernels to be able to run correctly.

I have a nightly performance test that runs every day, and it builds the Vulkan version correctly.

31/36 Test #31: test-backend-ops ..................***Exception: SegFault 33.16 sec

Hmm, but I see now that it's test-backend-ops that failed; that one isn't enabled in my nightly run. I'll give it a shot to see if I get the same failure.

@ggerganov
Member Author

@0cc4m The runner is using the v1.4.321.0 SDK from this site: https://vulkan.lunarg.com/sdk/home. I think this is the latest version.

ggml@ggml-100-mac-m4 ~ % vulkaninfo --summary 
==========
VULKANINFO
==========

Vulkan Instance Version: 1.4.321


Instance Extensions: count = 17
-------------------------------
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_headless_surface                : extension revision 1
VK_EXT_layer_settings                  : extension revision 2
VK_EXT_metal_surface                   : extension revision 1
VK_EXT_surface_maintenance1            : extension revision 1
VK_EXT_swapchain_colorspace            : extension revision 5
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_portability_enumeration         : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_LUNARG_direct_driver_loading        : extension revision 1
VK_MVK_macos_surface                   : extension revision 3

Instance Layers: count = 7
--------------------------
VK_LAYER_KHRONOS_profiles         Khronos Profiles layer                     1.4.321  version 1
VK_LAYER_KHRONOS_shader_object    Khronos Shader object layer                1.4.321  version 1
VK_LAYER_KHRONOS_synchronization2 Khronos Synchronization2 layer             1.4.321  version 1
VK_LAYER_KHRONOS_validation       Khronos Validation Layer                   1.4.321  version 1
VK_LAYER_LUNARG_api_dump          LunarG API dump layer                      1.4.321  version 2
VK_LAYER_LUNARG_gfxreconstruct    GFXReconstruct Capture Layer Version 1.0.5 1.4.321  version 4194309
VK_LAYER_LUNARG_screenshot        LunarG image capture layer                 1.4.321  version 1

Devices:
========
GPU0:
	apiVersion         = 1.3.313
	driverVersion      = 0.2.2108
	vendorID           = 0x106b
	deviceID           = 0xf010209
	deviceType         = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
	deviceName         = Apple M4
	driverID           = DRIVER_ID_MOLTENVK
	driverName         = MoltenVK
	driverInfo         = 1.3.0
	conformanceVersion = 1.3.8.0
	deviceUUID         = 0000106b-0f01-0209-0000-000000000000
	driverUUID         = 4d564b00-0000-283c-0f01-020900000000

@0cc4m
Collaborator

0cc4m commented Sep 23, 2025

I'm using the Homebrew version and it works just fine. It seems to be two minor versions ahead, though:

Devices:
========
GPU0:
        apiVersion         = 1.4.323
        driverVersion      = 0.2.2208
        vendorID           = 0x106b
        deviceID           = 0xf060209
        deviceType         = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
        deviceName         = Apple M4 Max
        driverID           = DRIVER_ID_MOLTENVK
        driverName         = MoltenVK
        driverInfo         = 1.4.0
        conformanceVersion = 1.4.2.0
        deviceUUID         = 0000106b-0f06-0209-0000-000000000000
        driverUUID         = 4d564b00-0000-28a0-0f06-020900000000

With that, it passes test-backend-ops fully.

@0cc4m
Collaborator

0cc4m commented Sep 23, 2025

Actually no, your version is far behind:

driverName         = MoltenVK
driverInfo         = 1.3.0

That is the problem.
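
The diagnosis above comes down to reading `driverName` and `driverInfo` in the `Devices` section of `vulkaninfo --summary`. A minimal Python sketch of that check follows. The helper names and the `MIN_MOLTENVK` threshold are hypothetical; the threshold is only an assumption taken from the statement in this thread that the problem was fixed in MoltenVK 1.4.

```python
import re

# Assumption: 1.4.0 is the oldest MoltenVK reported in this thread as passing
# test-backend-ops; this is not an official minimum.
MIN_MOLTENVK = (1, 4, 0)

# Trimmed sample in the shape of the `vulkaninfo --summary` output quoted above.
SAMPLE = """\
Vulkan Instance Version: 1.4.321

Instance Extensions: count = 17

Devices:
========
GPU0:
	apiVersion         = 1.3.313
	deviceName         = Apple M4
	driverName         = MoltenVK
	driverInfo         = 1.3.0
"""


def parse_devices(summary):
    """Collect the `key = value` pairs listed under each `GPUn:` heading."""
    devices = []
    current = None
    for line in summary.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"GPU\d+:", stripped):
            current = {}
            devices.append(current)
        elif current is not None and "=" in stripped:
            key, _, value = stripped.partition("=")
            current[key.strip()] = value.strip()
    return devices


def version_tuple(text):
    """Turn a string such as '1.3.0' into (1, 3, 0) for comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def outdated_moltenvk(summary, minimum=MIN_MOLTENVK):
    """Return the MoltenVK devices whose driverInfo is older than `minimum`.

    A device with no driverInfo field is treated as outdated.
    """
    return [
        device
        for device in parse_devices(summary)
        if device.get("driverName") == "MoltenVK"
        and version_tuple(device.get("driverInfo", "")) < minimum
    ]
```

On a real runner the summary text could be captured with `subprocess.run(["vulkaninfo", "--summary"], capture_output=True, text=True).stdout` and passed to `outdated_moltenvk`. The comparison uses `driverInfo` rather than the instance version because, as the output above shows, an SDK that reports instance version 1.4.321 can still load a MoltenVK 1.3.0 driver.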

@ggerganov ggerganov merged commit 0889589 into master Sep 23, 2025
103 of 113 checks passed
@ggerganov
Member Author

Thanks, that fixed it.

@ggerganov ggerganov deleted the gg/ci-mac-vulkan branch September 23, 2025 10:44
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Sep 23, 2025
* origin/master: (39 commits)
ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
ci : enable Vulkan workflow on Mac (ggml-org#16194)
ggml-cpu: Respect cpumask settings (ggml-org#16164)
ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
zdnn: refactor codebase + add docs (ggml-org#16178)
codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
devops: add s390x containers (ggml-org#15915)
ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
clang-tidy : disable warning about performance enum size (ggml-org#16127)
ggml : implement set_rows with i32 index (ggml-org#16159)
codeowners : update + cleanup (ggml-org#16174)
common : enable `--offline` mode without curl support (ggml-org#16137)
webui : fix handling incomplete chunks (ggml-org#16107)
embedding : fix typos in README (ggml-org#16171)
common : remove unused local variables (ggml-org#16140)
ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
ggml : add ggml_op_is_empty (ggml-org#16122)
codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
Vulkan: add conv_transpose_2d operation (ggml-org#16022)
...
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Sep 25, 2025
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025