sync : llama.cpp #1098

Merged
ggerganov merged 13 commits into master from sync-llama.cpp-25-02-03
Feb 3, 2025

Conversation

ggerganov
Member

No description provided.

jeffbolznv and others added 13 commits February 3, 2025 10:52
…lama/11436)

* vulkan: Catch pipeline creation failure and print an error message

Also, fix some warnings from my on-demand compile change.

* vulkan: fix pipeline creation logging
…a/11360)

* vulkan: initial support for IQ3_S

* vulkan: initial support for IQ3_XXS

* vulkan: initial support for IQ2_XXS

* vulkan: initial support for IQ2_XS

* vulkan: optimize Q3_K by removing branches

* vulkan: implement dequantize variants for coopmat2

* vulkan: initial support for IQ2_S

* vulkan: vertically realign code

* port failing dequant callbacks from mul_mm

* Fix array length mismatches

* vulkan: avoid using workgroup size before it is referenced

* tests: increase timeout for Vulkan llvmpipe backend

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
* Use sccache on ci for windows

* Detect sccache in cmake
* CUDA: use mma PTX instructions for FlashAttention

* __shfl_sync workaround for movmatrix

* add __shfl_sync to HIP

Co-authored-by: Diego Devesa <slarengh@gmail.com>
…res for amd gpus are not supersets of each other (llama/11601)

This fixes a bug where RDNA1 GPUs other than gfx1010 were not handled correctly
CUDA/HIP: add support for selectable warp size to mmv
@ggerganov ggerganov merged commit 498e0ec into master Feb 3, 2025
8 checks passed
@ggerganov ggerganov deleted the sync-llama.cpp-25-02-03 branch February 3, 2025 12:44
6 participants