Releases: 3Simplex/llama.cpp

b4254

03 Dec 19:41
91c36c2

server : (web ui) Various improvements, now use vite as bundler (#10599)

* hide buttons in dropdown menu

* use npm as deps manager and vite as bundler

* fix build

* fix build (2)

* fix responsive on mobile

* fix more problems on mobile

* sync build

* (test) add CI step for verifying build

* fix ci

* force rebuild .hpp files

* cmake: clean up generated files pre build

b4248

03 Dec 16:27
3b4f2e3

llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636)

b4164

25 Nov 16:58
9ca2e67

server : add speculative decoding support (#10455)

* server : add speculative decoding support

ggml-ci

* server : add helper function slot.can_speculate()

ggml-ci

b4153

22 Nov 14:32
6dfcfef

ci: Update oneAPI runtime dll packaging (#10428)

These are the minimum runtime DLL dependencies for oneAPI 2025.0.

b4145

20 Nov 20:51
9abe9ee

vulkan: predicate max operation in soft_max shaders/soft_max (#10437)

Fixes #10434

b4132

19 Nov 15:20
3ee6382

cuda : fix CUDA_FLAGS not being applied (#10403)

b4125

18 Nov 17:14
531cb1c

Skip searching root path for cross-compile builds (#10383)

b4100

16 Nov 18:29
bcdb7a2

server: (web UI) Add samplers sequence customization (#10255)

* Samplers sequence: simplified and input field.

* Removed unused function

* Modify and use `settings-modal-short-input`

* rename "name" --> "label"

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

b4067

12 Nov 14:48
54ef9cf

vulkan: Throttle the number of shader compiles during the build step.…

b4061

09 Nov 16:16
6423c65

metal : reorder write loop in mul mat kernel + style (#10231)

* metal : reorder write loop

* metal : int -> short, style

ggml-ci