Issues: Dao-AILab/flash-attention
#1636 Feature Request: Add FlashAttention-2 support for dandelin/vilt-b32-finetuned-vqa (opened Apr 30, 2025 by dasalazarb)
#1633 Error: ModuleNotFoundError: No module named 'flash_attn_3_cuda' (opened Apr 30, 2025 by talha-10xE)
#1632 Clarification on autotune using the Triton backend for AMD cards (opened Apr 30, 2025 by Kademo15)
#1630 How to determine the row block sizes for the Q/K/V matrices in cases? (opened Apr 29, 2025 by miaomiaoma0703)
#1622 flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS (opened Apr 27, 2025 by PeanutInMay)
#1620 RuntimeError: CUDA error: invalid configuration argument when using RMSNorm with zero-dimension input (opened Apr 26, 2025 by Luciennnnnnn)
#1619 [BUG] flash_attn_varlen_func does not support total seq_len smaller than batch size for GQA (opened Apr 26, 2025 by Luciennnnnnn)
#1615 Error when building FA2 on Windows using CUTLASS 3.9 on Torch 2.7.0 + CUDA 12.8 (opened Apr 23, 2025 by Panchovix)