Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[TRTLLM-6104] feat: add request_perf_metrics to triton LLMAPI backend
#5554
opened Jun 27, 2025 by
xuanzic
[feat] Support MXFP4 x BF16 Grouped GEMM in FusedMoE Pytorch Module
#5552
opened Jun 27, 2025 by
jinyangyuan-nvidia
feat: Optimize TRTLLM Sampler perf single beam single step
#5550
opened Jun 27, 2025 by
dcampora
rcca: test default kv_cache_reuse option for pytorch multimodal
#5544
opened Jun 27, 2025 by
StanleySun639
[nvbug 5304752][fix]: enhance _check_arguments to filter illegal requests for pytorch backend
#5541
opened Jun 27, 2025 by
LinPoly
Refactor: move DeepEP from Docker images to wheel building
#5534
opened Jun 27, 2025 by
yuantailing
•
Draft
[enh] [GH/CI] [WIP] Auto-assign PR reviewers using module-owners information randomly
#5530
opened Jun 27, 2025 by
venkywonka
[nvbugs/5302040] feat. Add whisper support (Bert Attention on SM100 and GPTAttention for cross attention on SM100)
#5527
opened Jun 26, 2025 by
wu6u3tw
[nvbug/5337601][fix] Fix disagg + speculative decoding
#5525
opened Jun 26, 2025 by
mikeiovine
[DRAFT] feat: transfer mm_data and refactor HyperCLOVAX & Qwen2/2.5-VL
#5522
opened Jun 26, 2025 by
yechank-nvidia
•
Draft
[feat] Add Tencent HunYuanMoEV1 model support
Labels: Community Engagement (help/insights needed from community); Community want to contribute (PRs initiated from Community)
#5521
opened Jun 26, 2025 by
qianbiaoxiang