merge from upstream #69

l3utterfly · 2025-06-03T08:15:41Z

No description provided.

* gguf: prevent non-native endian models from being loaded
* gguf: update error message
* gguf: make the non-native endian check more verbose
* ggml: move ggml_assert location
* ggml: reword the endianness check error message

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
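The endianness check named in these commit messages can be illustrated with a short sketch. It relies on the GGUF header starting with the 4-byte magic `GGUF` followed by a 32-bit version field. The heuristic used here (a small version number read with the wrong byte order has all-zero low 16 bits) and the names `looks_non_native_endian` and `make_header` are assumptions made for illustration; this is not the code from the commits.

```python
import struct
import sys

GGUF_MAGIC = b"GGUF"


def looks_non_native_endian(header: bytes) -> bool:
    """Return True if the version field appears byte-swapped for this host."""
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF header")
    # "=I" reads an unsigned 32-bit integer in the host's native byte order.
    (version,) = struct.unpack("=I", header[4:8])
    # Assumption: real versions are small integers, so a correctly read value
    # has non-zero low 16 bits, while a byte-swapped one (e.g. 0x03000000)
    # does not. Version 0 is excluded so it is not misreported as swapped.
    return version != 0 and (version & 0xFFFF) == 0


def make_header(version: int, byte_order: str) -> bytes:
    """Build a minimal header (magic + version) with an explicit byte order."""
    return GGUF_MAGIC + struct.pack(byte_order + "I", version)


# Byte-order prefixes for struct: the host's own order and the opposite one.
native = "<" if sys.byteorder == "little" else ">"
foreign = ">" if sys.byteorder == "little" else "<"

if looks_non_native_endian(make_header(3, foreign)):
    print("refusing to load: file endianness does not match the host")
```

A loader following this idea would reject the file with a descriptive error instead of continuing to parse fields that would all be misread.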

…gml-org#13826)

* [WIP]: fuse q8 quantization and reorder
* wip2: fuse q8 quantization and reorder
* working q8 reorder commit
* restored common.hpp
* remove debug prints
* remove unnecessary headers and remove trailing whitespace
* Update ggml/src/ggml-sycl/ggml-sycl.cpp

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>

…l-org#13966)

Some systems report the CPU implementation as "Power11" instead of "POWER11". The existing CMake logic uses a case-sensitive regular expression to extract the CPU generation, which fails when the casing doesn't exactly match "POWER". This patch provides a fix by first converting the string to uppercase before applying the regex.

Signed-off-by: root <root@rheldb2v.pperf.tadn.ibm.com>
Co-authored-by: root <root@rheldb2v.pperf.tadn.ibm.com>
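The fix described above (uppercase first, then match) can be sketched outside CMake as follows. The pattern `POWER ?([0-9]+)` and the function name `power_generation` are hypothetical; the actual regular expression used by the CMake logic is not shown in this PR.

```python
import re
from typing import Optional

# Hypothetical case-sensitive pattern standing in for the one in the CMake logic.
POWER_PATTERN = re.compile(r"POWER ?([0-9]+)")


def power_generation(cpu_description: str) -> Optional[int]:
    """Extract the POWER generation number, tolerating any letter casing."""
    # Normalizing to uppercase first lets the case-sensitive pattern
    # match "Power11" as well as "POWER11".
    match = POWER_PATTERN.search(cpu_description.upper())
    return int(match.group(1)) if match else None


# Without normalization the case-sensitive pattern misses the mixed-case form.
missed = POWER_PATTERN.search("Power11") is None
print(missed, power_generation("Power11"), power_generation("POWER11"))
```

Normalizing the input keeps the pattern itself simple, at the cost of losing the original casing, which does not matter when only the generation number is extracted.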

* mtmd : fix memory in mtmd_helper_eval_chunk_single
* mtmd-cli : fix mem leak
* Update tools/mtmd/mtmd-cli.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggerganov and others added 23 commits June 1, 2025 11:39

kv-cache : split implementation in separate sources (ggml-org#13920)

0fc16b4

ggml-ci

parallel : fix n_junk == 0 (ggml-org#13952)

c046217

readme : update bindings (ggml-org#13950)

8726392

ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)

fedf034

The goal is to have what users call "full logs" contain the backtrace. This is registered upon ggml_init. Also fixes a minor fd leak on Linux.

ggml : install dynamic backends (ggml/1240)

6eba72b

* ggml : install dynamic backends
  Make sure dynamic backends are installed in $CMAKE_INSTALL_BINDIR

sync : whisper.cpp (ggml/1250)

a7b8d35

* ggml : Fix backtrace breaking Windows build (whisper/3203)
* sync : whisper.cpp

ggml-ci

Co-authored-by: Daniel Tang <danielzgtg.opensource@gmail.com>

ggml : remove ggml_graph_import and ggml_graph_export declarations (g…

af6f91d

…gml/1247)

The implementation is already deleted with commit 9d0762e.

closes: ggml-org#1235

cmake : Fix broken CMake error messages (ggml/1252)

d337252

vulkan : Remove unexpected ; (ggml/1253)

108009f

sync : ggml

f3a4b16

ggml-ci

convert : fix vocab padding code for bert models (ggml-org#13954)

c496fe0

convert : fix nomic-bert-moe mask token (ggml-org#13757)

5e1c3ae

gguf: fix failure on version == 0 (ggml-org#13956)

7675c55

server: update deepseek reasoning format (pass reasoning_content as…

c9bbc77

… diffs) (ggml-org#13933)

* server: update deepseek reasoning format (now in reasoning_content diffs), add legacy option for compat
* update unit/test_tool_call.py::test_thoughts

gemma : more consistent attention scaling for v2 and v3 (ggml-org#13951)

5582c49

* gemma : fix attn scale for 27B
* cont : apply scale before attn
* cont : consistent attention scaling

metal : use F32 accumulators in FA kernels (ggml-org#13975)

ea394d7

ggml-ci

server : disable speculative decoding for SWA models (ggml-org#13970)

3637576

* server : use swa-full for draft context

ggml-ci

* server : disable speculative decoding for SWA models

OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (ggml-org#1…

bfb1e01

…3840)

* add concat, pad, repeat, tsembd, tanh, upscale
* small fixes

opencl: add backend_synchronize (ggml-org#13939)

71e74a3

* This is not needed by the normal use where the result is read using `tensor_get`, but it allows perf mode of `test-backend-ops` to properly measure performance.
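The reason a synchronize step matters for performance measurement can be shown with a generic analogy. This sketch uses a Python thread pool as a stand-in for an asynchronous backend; `timed_submit` and `slow_work` are hypothetical names, and nothing here reflects the OpenCL backend's actual API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_work() -> None:
    """Stand-in for a kernel that runs asynchronously on a device."""
    time.sleep(0.05)


def timed_submit(executor: ThreadPoolExecutor, work, synchronize: bool) -> float:
    """Time a submitted task, optionally waiting for it to finish."""
    start = time.perf_counter()
    future = executor.submit(work)
    if synchronize:
        future.result()  # block until the asynchronous work has completed
    elapsed = time.perf_counter() - start
    future.result()  # always drain so later measurements are not affected
    return elapsed


with ThreadPoolExecutor(max_workers=1) as pool:
    unsynced = timed_submit(pool, slow_work, synchronize=False)
    synced = timed_submit(pool, slow_work, synchronize=True)

print(f"without synchronize: {unsynced:.4f}s, with synchronize: {synced:.4f}s")
```

Without the wait, the timer only captures the cost of enqueueing the work, which is why a benchmark that never reads the result back needs an explicit synchronization point.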

github-actions bot added SYCL Vulkan testing examples python server ggml labels Jun 3, 2025

github-actions bot added Apple Metal script labels Jun 3, 2025

Merge branch 'layla-build' into merge

7c054e2

l3utterfly merged commit 2d10c89 into layla-build Jun 3, 2025
22 of 48 checks passed

l3utterfly deleted the merge branch June 3, 2025 08:20
