Pull requests: flashinfer-ai/flashinfer
Use global TuningConfig, to fix memory leak caused by AutoTuner LRU cache and dynamic lambda TuningConfig
#2140
opened Nov 24, 2025 by
juju812
5 tasks done
feat: add trtllm-gen per-tensor sparseMla kernels.
#2138
opened Nov 24, 2025 by
PerkzZheng
5 tasks done
fix: some bugs of headDim 256 trtllm-gen fmha kernels.
#2137
opened Nov 24, 2025 by
PerkzZheng
5 tasks done
fix(trtllm): reset negative strideBatch to 0 for ragged KV layout to …
#2134
opened Nov 23, 2025 by
YAMY1234
5 tasks done
feat: add seed offset args to sampler to allow cuda graph support
#2132
opened Nov 23, 2025 by
ksukrit
5 tasks done
A unified API for the MNNVL and single-node AllReduce kernels.
#2130
opened Nov 21, 2025 by
nvmbreughe
•
Draft
5 tasks
[wip] feat: support variable sequence length in decode kernel of trtllm-gen attention
#2125
opened Nov 20, 2025 by
yaoyaoding
•
Draft
5 tasks
perf: using multi-cta optimization for top-k/top-p
#2119
opened Nov 20, 2025 by
yzh119
4 of 5 tasks
refactor: update fa3 codebase and fix hopper unittest [part 1]
#2111
opened Nov 19, 2025 by
yzh119
4 of 5 tasks
feat: support more head dim in RoPE kernel
#2109
opened Nov 19, 2025 by
raayandhar
5 tasks done
make DeepGEMM swapAB available for linear gemm SM90
#2101
opened Nov 17, 2025 by
xuanzic
5 tasks
refactor: pass hopper deepgemm include directory through python
#2090
opened Nov 14, 2025 by
yzh119
4 of 5 tasks
[To merge AFTER flashinfer-ci changes updated] Reduce test time by moving compilation off-line
#2089
opened Nov 14, 2025 by
kahyunnam
5 tasks done
feat: BF16 GEMM using CUTLASS backend for SM100
#2070
opened Nov 10, 2025 by
raayandhar
5 tasks done
Rebase FP8 SM100 Cutlass FMHA Attention to main (original PR#1238)
#2047
opened Nov 5, 2025 by
pavanimajety
•
Draft
5 tasks
Refactor flashinfer/__init__.py so that applications could selectively pack submodules without modifying __init__.py
#2027
opened Nov 3, 2025 by
bangshengtang
5 tasks done
refactor: backend_requirement + supported_compute_capability decorator for gemm
#2000
opened Oct 29, 2025 by
jimmyzho
5 tasks
chore: agentic workflow for automatic version bump
#1947
opened Oct 19, 2025 by
yzh119
5 tasks