Pull requests: ggml-org/llama.cpp


Pull requests list

Add CFG to server
#2217 opened Jul 13, 2023 by SlyEcho Draft
1 of 4 tasks
Add support for Deepseek-R1 flash attention
Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)
#11557 opened Jan 31, 2025 by siddartha-RE Review required
llama : support Jamba hybrid Transformer-Mamba models
Labels: android (Issues specific to Android); embeddings (embedding related topics); enhancement (New feature or request); examples; ggml (changes relating to the ggml tensor library for machine learning); model (Model specific); need feedback (Testing and feedback with results are needed); python (python script changes); refactoring (Refactoring); Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs); server
#7531 opened May 25, 2024 by compilade Draft
7 of 17 tasks
Q4_0 scale selection using RMSE
Labels: enhancement (New feature or request); Less than 4 bits (Efforts related to viable quantized models using <4 bits); research 🔬; Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs)
#835 opened Apr 7, 2023 by sw Draft
Run several single thread operators parellel
Labels: threading (Parallel processing and thread management)
#850 opened Apr 8, 2023 by howard0su Review required
Use Threadpool to schedule the work
Labels: threading (Parallel processing and thread management)
#851 opened Apr 8, 2023 by howard0su Draft
Add command mode to interactive mode.
Labels: enhancement (New feature or request); Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level)
#1022 opened Apr 17, 2023 by wbpxre150 Review required
llama : quantize attention results
Labels: demo (Demonstrate some concept or idea, not intended to be merged)
#1103 opened Apr 21, 2023 by ggerganov Draft
fix(LoRA): debugging
#1190 opened Apr 26, 2023 by jon-chuang Review required
Create run.py
Labels: enhancement (New feature or request); obsolete? (Marker for potentially obsolete PR); python (python script changes); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix); script (Script related)
#1204 opened Apr 27, 2023 by jdpsl Review required
ggml : spread compute across threads in chunks
Labels: demo (Demonstrate some concept or idea, not intended to be merged); threading (Parallel processing and thread management)
#1507 opened May 17, 2023 by ggerganov Review required
Added Arbitrary mixed quantization
Labels: Less than 4 bits (Efforts related to viable quantized models using <4 bits); research 🔬
#1834 opened Jun 13, 2023 by Milkdrop Review required
Avoid unused constant warnings
Labels: refactoring (Refactoring); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix)
#2029 opened Jun 28, 2023 by set-soft Review required