ggml-org / llama.cpp Public

Pull requests: ggml-org/llama.cpp

404 Open 5,223 Closed

Pull requests list

Add CFG to server

#2217 opened Jul 13, 2023 by SlyEcho • Draft

1 of 4 tasks

Add support for Deepseek-R1 flash attention

Labels: ggml (changes relating to the ggml tensor library for machine learning); Nvidia GPU (Issues specific to Nvidia GPUs)

#11557 opened Jan 31, 2025 by siddartha-RE • Review required

llama : support Jamba hybrid Transformer-Mamba models

Labels: android (Issues specific to Android); embeddings (embedding related topics); enhancement (New feature or request); examples; ggml (changes relating to the ggml tensor library for machine learning); model (Model specific); need feedback (Testing and feedback with results are needed); python (python script changes); refactoring (Refactoring); Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs); server

#7531 opened May 25, 2024 by compilade • Draft

7 of 17 tasks

Q4_0 scale selection using RMSE

Labels: enhancement (New feature or request); Less than 4 bits (Efforts related to viable quantized models using <4 bits); research 🔬; Review Complexity : High (Generally require indepth knowledge of LLMs or GPUs)

#835 opened Apr 7, 2023 by sw • Draft

Run several single thread operators parellel

Labels: threading (Parallel processing and thread management)

#850 opened Apr 8, 2023 by howard0su • Review required

Use Threadpool to schedule the work

Labels: threading (Parallel processing and thread management)

#851 opened Apr 8, 2023 by howard0su • Draft

Add mmap pages stats (disabled by default)

#1015 opened Apr 16, 2023 by prusnak • Review required

Add command mode to interactive mode.

Labels: enhancement (New feature or request); Review Complexity : Medium (Generally require more time to grok but manageable by beginner to medium expertise level)

#1022 opened Apr 17, 2023 by wbpxre150 • Review required

Add a option to force the token end of text apears even on interative, and also shows loading porcentage

#1058 opened Apr 19, 2023 by jeffersoncgo • Review required

llama : quantize attention results

Labels: demo (Demonstrate some concept or idea, not intended to be merged)

#1103 opened Apr 21, 2023 by ggerganov • Draft

llama : add llama_set_attn_type API

Labels: examples

#12615 opened Mar 27, 2025 by ngxson • Review required

fix(LoRA): debugging

#1190 opened Apr 26, 2023 by jon-chuang • Review required

Getting started documentation

#1198 opened Apr 26, 2023 by TheNotary • Review required

Create run.py

Labels: enhancement (New feature or request); obsolete? (Marker for potentially obsolete PR); python (python script changes); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix); script (Script related)

#1204 opened Apr 27, 2023 by jdpsl • Review required

Add "-e"/"--eval-threads" to distinguish thread counts for single-token eval and prompt eval

Labels: threading (Parallel processing and thread management)

#744 opened Apr 3, 2023 by MagisterLuddite • Draft

[Research] Steering vectors

Labels: research 🔬

#1472 opened May 16, 2023 by SlyEcho • Draft

Upgrade v1/v2 format to v3 by leveraging quantize

#1504 opened May 17, 2023 by howard0su • Review required

ggml : spread compute across threads in chunks

Labels: demo (Demonstrate some concept or idea, not intended to be merged); threading (Parallel processing and thread management)

#1507 opened May 17, 2023 by ggerganov • Review required

Llama cpp low level python bindings

#1660 opened Jun 1, 2023 by dmahurin • Review required

Added Arbitrary mixed quantization

Labels: Less than 4 bits (Efforts related to viable quantized models using <4 bits); research 🔬

#1834 opened Jun 13, 2023 by Milkdrop • Review required

Disable _O_WTEXT when using main in MinGW

#1897 opened Jun 16, 2023 by asctime • Review required

Example work stealing chunked task allocator for issue #291

#2026 opened Jun 27, 2023 by mqy • Draft

Avoid unused constant warnings

Labels: refactoring (Refactoring); Review Complexity : Low (Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix)

#2029 opened Jun 28, 2023 by set-soft • Review required

try to fix compile warnings on macOS, address issue #2036

#2037 opened Jun 28, 2023 by mqy • Review required

Implement get_num_physical_cores() for Windows

#1278 opened May 2, 2023 by DannyDaemonic • Review required
