Pull requests: Dao-AILab/flash-attention
check torch.is_grad_enabled before calling customer flash atten ops
#1397
opened Dec 19, 2024 by XiaobingSuper
Add hipBLAS/cuBLAS distinction in benchmark_gemm.py
#1393
opened Dec 17, 2024 by garrettbyrd
wrap func into torch ops to avoid torch.compile graphbreaks
#1333
opened Nov 13, 2024 by kumarkrishna
Promote wheels as alternative to pip install flash-attn
#1297
opened Oct 25, 2024 by simonw
fix: in newer versions of triton, tl.dot should take as input only q …
#1288
opened Oct 21, 2024 by EdouardYvinec
the test_flash_attn.py it's actually in parent directory
#1167
opened Aug 21, 2024 by ArtificialZeng
Add support for qk hidden dim different from v hidden dim
#1166
opened Aug 20, 2024 by smallscientist1
Fix: bwd may need to first allocate cuda mem for rng_state
#1077
opened Jul 20, 2024 by jundaf2
[Draft] support qk head_dim different from vo head_dim
#980
opened Jun 6, 2024 by defei-coder
Add local version identifier to package metadata for pre-built wheels
#856
opened Feb 28, 2024 by yundai424