CUDA: use mma PTX instructions for FlashAttention #19018

Annotations

1 warning

windows-latest-cmake (avx-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX...

succeeded Feb 2, 2025 in 4m 45s