Conversation

@jeffbolznv
Collaborator

This should handle fusions that have a single output. If a fusion has more than one output (as in some of the top-k paths), the intermediate output won't currently be checked. It should be possible to handle that as well by allocating and comparing multiple results, but I haven't done that yet.

I added a couple missing ops that I hit running test-backend-ops.

Indentation makes this diff look bigger than it really is; I'd suggest reviewing it in a better diff tool.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner November 1, 2025 17:47
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Nov 1, 2025
@am17an
Collaborator

am17an commented Nov 2, 2025

> Indentation makes this diff look bigger than it really is, I'd suggest reviewing it in a better diff tool.

You can append ?w=1 to the URL to hide whitespace-only changes in the diff: https://github.com/ggml-org/llama.cpp/pull/16919?w=1

@0cc4m
Collaborator

0cc4m commented Nov 2, 2025

Very cool feature, thank you. It's also in the UI:
[screenshot]

@0cc4m 0cc4m left a comment

Thank you!

@0cc4m 0cc4m merged commit a44d771 into ggml-org:master Nov 5, 2025
69 of 72 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Nov 5, 2025
* origin/master: (21 commits)
vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (ggml-org#16919)
examples(gguf): GGUF example outputs (ggml-org#17025)
mtmd: allow QwenVL to process larger image by default (ggml-org#17020)
server : do not default to multiple slots with speculative decoding (ggml-org#17017)
mtmd: improve struct initialization (ggml-org#16981)
docs: Clarify the endpoint that webui uses (ggml-org#17001)
model : add openPangu-Embedded (ggml-org#16941)
ggml webgpu: minor set rows optimization (ggml-org#16810)
sync : ggml
ggml : fix conv2d_dw SVE path (ggml/1380)
CUDA: update ops.md (ggml-org#17005)
opencl: update doc (ggml-org#17011)
refactor: replace sprintf with snprintf for safer string handling in dump functions (ggml-org#16913)
vulkan: remove the need for the dryrun (ggml-org#16826)
server : do context shift only while generating (ggml-org#17000)
readme : update hot topics (ggml-org#17002)
ggml-cpu : bicubic interpolation (ggml-org#16891)
ci : apply model label to models (ggml-org#16994)
chore : fix models indent after refactor (ggml-org#16992)
Fix garbled output with REPACK at high thread counts (ggml-org#16956)
...
