Releases: arthw/llama.cpp

b3555

07 Aug 16:34
fix error

b3554

07 Aug 16:26
ggml-backend : fix async copy from CPU (#8897)

* ggml-backend : fix async copy from CPU

* cuda : more reliable async copy, fix stream used when the devices are the same

b3517

02 Aug 05:48
[SYCL] Fixing wrong VDR iq4nl value (#8812)

b3482

01 Aug 06:57
c16f01b
Merge pull request #2 from arthw/refactor_dev

Refactor device management and usage api

b3475

27 Jul 14:49
llama : add support for llama 3.1 rope scaling factors (#8676)

* Add llama 3.1 rope scaling factors to llama conversion and inference

This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope operation, improving results for context windows above 8192.

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <git@compilade.net>

* address comments

* address comments

* Update src/llama.cpp

Co-authored-by: compilade <git@compilade.net>

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <git@compilade.net>

---------

Co-authored-by: compilade <git@compilade.net>
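
As a rough illustration of the rope factors described in this entry, here is a minimal sketch of the publicly described Llama 3.1 frequency-scaling scheme. It is not taken from this commit: the function name `llama31_rope_factors` is hypothetical, and the default values (`head_dim=128`, `base=500000.0`, `factor=8.0`, `low_freq_factor=1.0`, `high_freq_factor=4.0`, `old_context_len=8192`) are assumptions based on the published Llama 3.1 configuration. A converter would normally read them from the model's hyperparameters.

```python
import math


def llama31_rope_factors(head_dim=128, base=500000.0, factor=8.0,
                         low_freq_factor=1.0, high_freq_factor=4.0,
                         old_context_len=8192):
    """Return one scaling factor per rotary frequency (head_dim // 2 values).

    Hypothetical helper: all defaults are assumed, not read from a model.
    """
    # Wavelength thresholds that separate the three frequency bands.
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor

    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (base ** (i / head_dim))
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency band: left unscaled.
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: scaled by the full factor.
            factors.append(factor)
        else:
            # Transition band: interpolate smoothly between the two.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors


if __name__ == "__main__":
    rope_factors = llama31_rope_factors()
    print(len(rope_factors), rope_factors[0], rope_factors[-1])
```

In this sketch, each base frequency is divided by the matching factor, so short-wavelength components are left alone and long-wavelength components are stretched to cover the longer context. Storing the list as a tensor in the converted model, as the entry describes, lets inference apply the factors without recomputing them.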

b3388

14 Jul 04:09
fix UT of concat

b3387

13 Jul 18:00
mv softmax to separated file

b3313

13 Jul 17:38
fix for multiple cards

b3312

13 Jul 09:34
aeaed61
Merge pull request #1 from arthw/update_warp

[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266) cherry-pick b549a1bbefb2f1fbb8b558bac1f2ae7967e60964

b3309

07 Jul 14:23
py : switch to snake_case (#8305)

* py : switch to snake_case

ggml-ci

* cont

ggml-ci

* cont

ggml-ci

* cont : fix link

* gguf-py : use snake_case in scripts entrypoint export

* py : rename requirements for convert_legacy_llama.py

Needed for scripts/check-requirements.sh

---------

Co-authored-by: Francis Couture-Harpin <git@compilade.net>