
CUDA: fix Volta FlashAttention logic #11615

Merged: 1 commit into ggml-org:master on Feb 3, 2025

Conversation

JohannesGaessler (Collaborator)

Fixes #11583 (comment).

The problem is that the kernel selection logic is wrong. The "tile" and "vec" kernels should be used when no tensor cores are available at all, not when the new tensor core type is unavailable.
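
As a minimal sketch of the corrected dispatch order (assumptions: the availability checks are modeled on ggml-cuda's fp16_mma_available/new_mma_available helpers and their compute-capability thresholds; select_flash_attn_kernel and the launch_* stubs are hypothetical stand-ins, not the actual llama.cpp code):

```cpp
// Minimal, self-contained sketch of the fixed selection logic.
// fp16_mma_available/new_mma_available approximate the ggml-cuda helpers;
// the launch_* functions are hypothetical stand-ins for the real kernels.
#include <cstdio>
#include <initializer_list>

static bool fp16_mma_available(int cc) { return cc >= 700; } // any FP16 tensor cores (Volta and newer)
static bool new_mma_available (int cc) { return cc >= 750; } // new MMA instructions (Turing and newer)

static void launch_tile_or_vec_kernel() { std::puts("tile/vec kernel"); }
static void launch_wmma_kernel()        { std::puts("WMMA kernel");     }
static void launch_mma_kernel()         { std::puts("new MMA kernel");  }

// Corrected order: tile/vec only when there are no tensor cores at all;
// Volta falls back to the WMMA kernel instead of tile/vec.
static void select_flash_attn_kernel(int cc) {
    if (!fp16_mma_available(cc)) {
        launch_tile_or_vec_kernel();   // e.g. Pascal (cc 6.x): no tensor cores
    } else if (!new_mma_available(cc)) {
        launch_wmma_kernel();          // Volta (cc 700): old tensor cores only
    } else {
        launch_mma_kernel();           // Turing (cc 750) and newer
    }
}

int main() {
    for (int cc : {610, 700, 750, 860}) {
        std::printf("cc %3d -> ", cc);
        select_flash_attn_kernel(cc);
    }
    return 0;
}
```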

@github-actions bot added the Nvidia GPU (Issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) labels on Feb 3, 2025
ggerganov (Member)

I'm now getting this error:

/home/ggml/work/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-mma-f16.cuh:453: ERROR: CUDA kernel flash_attn_ext_f16_process_tile has no device code compatible with CUDA arch 700. ggml-cuda.cu was compiled for: 700

JohannesGaessler (Collaborator, Author)

The code seems to have also been missing a return statement.
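
For illustration, reusing the stubs from the sketch above (again a hypothetical reconstruction, not the actual diff): without a return after the Volta branch, control falls through into the new-MMA path, which has no device code for arch 700, matching the runtime error quoted above.

```cpp
// Hypothetical fall-through illustration; helpers as in the previous sketch.
static void flash_attn_dispatch(int cc) {
    if (fp16_mma_available(cc) && !new_mma_available(cc)) {
        launch_wmma_kernel(); // Volta path
        return;               // the missing return: without it, execution falls
                              // through to the MMA launch below, which has no
                              // device code for cc 700 and aborts at runtime
    }
    launch_mma_kernel();      // requires the new MMA instructions (Turing+)
}
```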

ggerganov (Member) left a comment

It works now.

ggerganov merged commit 21c84b5 into ggml-org:master on Feb 3, 2025 (46 checks passed).
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request on Feb 4, 2025.
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request on Feb 13, 2025.