Pull requests: ggml-org/llama.cpp

Pull requests list

Q4_0 scale selection using RMSE
Labels: enhancement (New feature or request); Less than 4 bits (Efforts related to viable quantized models using <4 bits); research 🔬; Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs)
#835 opened Apr 7, 2023 by sw Draft
Run several single thread operators parellel
Labels: threading (Parallel processing and thread management)
#850 opened Apr 8, 2023 by howard0su
Use Threadpool to schedule the work
Labels: threading (Parallel processing and thread management)
#851 opened Apr 8, 2023 by howard0su Draft
Add command mode to interactive mode.
Labels: enhancement (New feature or request); Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level)
#1022 opened Apr 17, 2023 by wbpxre150
llama : quantize attention results
Labels: demo (Demonstrate some concept or idea, not intended to be merged)
#1103 opened Apr 21, 2023 by ggerganov Draft
main: add pledge call on OpenBSD
#1132 opened Apr 22, 2023 by codesoap
fix(LoRA): debugging
#1190 opened Apr 26, 2023 by jon-chuang
Getting started documentation
#1198 opened Apr 26, 2023 by TheNotary
Create run.py
Labels: enhancement (New feature or request); obsolete? (Marker for potentially obsolete PR); python (python script changes); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix); script (Script related)
#1204 opened Apr 27, 2023 by jdpsl
Upgrade v1/v2 format to v3 by leveraging quantize
#1504 opened May 17, 2023 by howard0su
ci: add linux binaries to release build
#1505 opened May 17, 2023 by Green-Sky
ggml : spread compute across threads in chunks
Labels: demo (Demonstrate some concept or idea, not intended to be merged); threading (Parallel processing and thread management)
#1507 opened May 17, 2023 by ggerganov
Llama cpp low level python bindings
#1660 opened Jun 1, 2023 by dmahurin
Added Arbitrary mixed quantization
Labels: Less than 4 bits (Efforts related to viable quantized models using <4 bits); research 🔬
#1834 opened Jun 13, 2023 by Milkdrop
Disable _O_WTEXT when using main in MinGW
#1897 opened Jun 16, 2023 by asctime
Avoid unused constant warnings
Labels: refactoring (Refactoring); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix)
#2029 opened Jun 28, 2023 by set-soft
try to fix compile warnings on macOS, address issue #2036
#2037 opened Jun 28, 2023 by mqy
llama : add llama_set_attn_type API
Labels: examples
#12615 opened Mar 27, 2025 by ngxson