Pull requests: ggml-org/llama.cpp
Pull requests list
Add support for Deepseek-R1 flash attention
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#11557
opened Jan 31, 2025 by
siddartha-RE
•
Review required
llama : support Jamba hybrid Transformer-Mamba models
android
Issues specific to Android
embeddings
embedding related topics
enhancement
New feature or request
examples
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
need feedback
Testing and feedback with results are needed
python
python script changes
refactoring
Refactoring
Review Complexity : High
Generally require in-depth knowledge of LLMs or GPUs
server
Q4_0 scale selection using RMSE
enhancement
New feature or request
Less than 4 bits
Efforts related to viable quantized models using <4 bits
research 🔬
Review Complexity : High
Generally require in-depth knowledge of LLMs or GPUs
Run several single-thread operators in parallel
threading
Parallel processing and thread management
#850
opened Apr 8, 2023 by
howard0su
•
Review required
Add command mode to interactive mode.
enhancement
New feature or request
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#1022
opened Apr 17, 2023 by
wbpxre150
•
Review required
Add an option to force the end-of-text token to appear even in interactive mode, and also show loading percentage
#1058
opened Apr 19, 2023 by
jeffersoncgo
•
Review required
Create run.py
enhancement
New feature or request
obsolete?
Marker for potentially obsolete PR
python
python script changes
Review Complexity : Low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
script
Script related
#1204
opened Apr 27, 2023 by
jdpsl
•
Review required
Add "-e"/"--eval-threads" to distinguish thread counts for single-token eval and prompt eval
threading
Parallel processing and thread management
#744
opened Apr 3, 2023 by
MagisterLuddite
•
Draft
Upgrade v1/v2 format to v3 by leveraging quantize
#1504
opened May 17, 2023 by
howard0su
•
Review required
ggml : spread compute across threads in chunks
demo
Demonstrate some concept or idea, not intended to be merged
threading
Parallel processing and thread management
#1507
opened May 17, 2023 by
ggerganov
•
Review required
Added Arbitrary mixed quantization
Less than 4 bits
Efforts related to viable quantized models using <4 bits
research 🔬
#1834
opened Jun 13, 2023 by
Milkdrop
•
Review required
Avoid unused constant warnings
refactoring
Refactoring
Review Complexity : Low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#2029
opened Jun 28, 2023 by
set-soft
•
Review required
try to fix compile warnings on macOS, address issue #2036
#2037
opened Jun 28, 2023 by
mqy
•
Review required
Implement get_num_physical_cores() for Windows
#1278
opened May 2, 2023 by
DannyDaemonic
•
Review required