Issues: Dao-AILab/flash-attention
#1636 Feature Request: Add FlashAttention-2 support for dandelin/vilt-b32-finetuned-vqa (opened Apr 30, 2025 by dasalazarb)
#1633 Error: ModuleNotFoundError: No module named 'flash_attn_3_cuda' (opened Apr 30, 2025 by talha-10xE)
#1632 Clarification on autotune using the Triton backend for AMD cards (opened Apr 30, 2025 by Kademo15)
#1630 How to determine the row block sizes for the Q/K/V matrices in cases? (opened Apr 29, 2025 by miaomiaoma0703)
#1622 flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS (opened Apr 27, 2025 by PeanutInMay)
#1620 RuntimeError: CUDA error: invalid configuration argument when using RMSNorm with zero-dimension input (opened Apr 26, 2025 by Luciennnnnnn)
#1619 [BUG] flash_attn_varlen_func does not support total seq_len smaller than batch size for GQA (opened Apr 26, 2025 by Luciennnnnnn)
#1615 Error when building FA2 on Windows using CUTLASS 3.9 on Torch 2.7.0 + CUDA 12.8 (opened Apr 23, 2025 by Panchovix)