Conversation

ggerganov (Member)

Alternative to #15985 and #16565.

The gpuAddress property is not always available. Replace it with an atomic counter.

@github-actions github-actions bot added ggml changes relating to the ggml tensor library for machine learning Apple Metal https://en.wikipedia.org/wiki/Metal_(API) labels Oct 14, 2025
jjerphan added a commit to jjerphan/llama.cpp-feedstock that referenced this pull request Oct 14, 2025
See: ggml-org/llama.cpp#16576

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
@ggerganov ggerganov merged commit fa882fd into master Oct 14, 2025
66 of 70 checks passed
@ggerganov ggerganov deleted the gg/metal-avoid-gpu-address branch October 14, 2025 17:33
jjerphan added a commit to conda-forge/llama.cpp-feedstock that referenced this pull request Oct 14, 2025
See: ggml-org/llama.cpp#16576

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
ddh0 added a commit to ddh0/llama.cpp that referenced this pull request Oct 14, 2025
* cuda : remove legacy copy-op pointer indirection code (ggml-org#16485)

* remove legacy copy-op pointer indirection code

* further removal of copy-op indirection code

* renamed check_node_graph_compatibility_and_refresh_copy_ops function

* CUDA: add fp kernel for larger batch size MoE (ggml-org#16512)

* CUDA: kernel for larger batch sizes for MoE

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* fixup

* tests

* Move mmq_ids_helper to mmid

* cleanup

* Remove redundant checks

* CUDA: use fastdiv + ggml_cuda_mad for mmvf (ggml-org#16557)

* CUDA: use fastdiv + ggml_cuda_mad for mmvf

* use bf16 directly + fix formatting

* Add exception for HIP code

* CUDA: enable FA for FP32 KV cache (ggml-org#16546)

* vulkan: Improve build time for MSVC (ggml-org#16545)

Enable CMP0147 so custom build steps (invoking vulkan-shader-gen) are run in parallel.

Enable /MP so source files are compiled in parallel.
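The two settings above can be sketched as a CMake fragment; this is an illustrative config, not the exact lines from the PR:

```cmake
# Run custom build steps (e.g. vulkan-shader-gen invocations) in parallel
# under Visual Studio generators (CMake >= 3.27).
cmake_policy(SET CMP0147 NEW)

if (MSVC)
    # Compile source files in parallel within each target
    add_compile_options(/MP)
endif()
```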

* vulkan: Support FA with K/V in F32 (ggml-org#16543)

* CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion (ggml-org#16577)

* vulkan: Add ACC_TYPE_VEC2 implementation (ggml-org#16203)

Signed-off-by: Stefan Savic <stefan.savic@huawei.com>
Co-authored-by: Stefan Savic <stefan.savic@huawei.com>

* metal : avoid using Metal's gpuAddress property (ggml-org#16576)

* metal : avoid using Metal's gpuAddress property

* metal : fix rope kernels buffer check

---------

Signed-off-by: Stefan Savic <stefan.savic@huawei.com>
Co-authored-by: Anav Prasad <anavp@nvidia.com>
Co-authored-by: Aman Gupta <amangupta052@gmail.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
Co-authored-by: SavicStefan <50296686+SavicStefan@users.noreply.github.com>
Co-authored-by: Stefan Savic <stefan.savic@huawei.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025
* metal : avoid using Metal's gpuAddress property

* metal : fix rope kernels buffer check
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Oct 15, 2025
* origin/master:
Add server-driven parameter defaults and syncing (ggml-org#16515)
metal: optimise `GGML_OP_SUM` (ggml-org#16559)
server : fix img token logs (ggml-org#16595)
llama-quant: add support for mmproj (ggml-org#16592)
CUDA: Changing the CUDA scheduling strategy to spin (ggml-org#16585)
server : fix mtmd checkpoints (ggml-org#16591)
metal : avoid using Metal's gpuAddress property (ggml-org#16576)
vulkan: Add ACC_TYPE_VEC2 implementation (ggml-org#16203)
CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion (ggml-org#16577)
vulkan: Support FA with K/V in F32 (ggml-org#16543)
vulkan: Improve build time for MSVC (ggml-org#16545)
CUDA: enable FA for FP32 KV cache (ggml-org#16546)
CUDA: use fastdiv + ggml_cuda_mad for mmvf (ggml-org#16557)
CUDA: add fp kernel for larger batch size MoE (ggml-org#16512)
cuda : remove legacy copy-op pointer indirection code (ggml-org#16485)
server : dynamic token limit for prompt cache (ggml-org#16560)
kyano pushed a commit to kyano/llama.cpp that referenced this pull request Oct 17, 2025