Conversation

@jeffbolznv
Collaborator

This should handle fusions that have a single output. If a fusion has more than one output (as in some of the top-k paths), the intermediate output won't currently be checked. It should be possible to handle that as well by allocating and comparing multiple results, but I haven't done that yet.

I added a couple missing ops that I hit running test-backend-ops.

Indentation makes this diff look bigger than it really is; I'd suggest reviewing it in a better diff tool.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner November 1, 2025 17:47
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Nov 1, 2025
@am17an
Collaborator

am17an commented Nov 2, 2025

> Indentation makes this diff look bigger than it really is, I'd suggest reviewing it in a better diff tool.

You can append ?w=1 to the URL to hide whitespace-only changes in the diff: https://github.com/ggml-org/llama.cpp/pull/16919?w=1

@0cc4m
Collaborator

0cc4m commented Nov 2, 2025

Very cool feature, thank you. It's also in the UI:
[screenshot]

@0cc4m 0cc4m left a comment

Thank you!

@0cc4m 0cc4m merged commit a44d771 into ggml-org:master Nov 5, 2025
69 of 72 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Nov 5, 2025
* origin/master: (21 commits)
vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (ggml-org#16919)
examples(gguf): GGUF example outputs (ggml-org#17025)
mtmd: allow QwenVL to process larger image by default (ggml-org#17020)
server : do not default to multiple slots with speculative decoding (ggml-org#17017)
mtmd: improve struct initialization (ggml-org#16981)
docs: Clarify the endpoint that webui uses (ggml-org#17001)
model : add openPangu-Embedded (ggml-org#16941)
ggml webgpu: minor set rows optimization (ggml-org#16810)
sync : ggml
ggml : fix conv2d_dw SVE path (ggml/1380)
CUDA: update ops.md (ggml-org#17005)
opencl: update doc (ggml-org#17011)
refactor: replace sprintf with snprintf for safer string handling in dump functions (ggml-org#16913)
vulkan: remove the need for the dryrun (ggml-org#16826)
server : do context shift only while generating (ggml-org#17000)
readme : update hot topics (ggml-org#17002)
ggml-cpu : bicubic interpolation (ggml-org#16891)
ci : apply model label to models (ggml-org#16994)
chore : fix models indent after refactor (ggml-org#16992)
Fix garbled output with REPACK at high thread counts (ggml-org#16956)
...
