Issues: NVIDIA/TensorRT-LLM

Issues list

Deepseek-v3 running on 2xH100 nodes getting poor performance bug Something isn't working
#2786 opened Feb 14, 2025 by zymy-chen
2 of 4 tasks
The performance of Qwen1.5-7B based on the trtllm-bench test was very poor bug Something isn't working
#2785 opened Feb 14, 2025 by ruru5697
3 of 4 tasks
Bug when loading an engine using LoRA through LLM API bug Something isn't working Investigating LLM API/Workflow triaged Issue has been triaged by maintainers
#2782 opened Feb 13, 2025 by pei0033
2 of 4 tasks
GPU Utilization drops gradually over time using Executor API bug Something isn't working
#2778 opened Feb 12, 2025 by MahmoudAshraf97
3 of 4 tasks
Inconsistent Batch Index Order in Decoupled Mode with trt-llm bug Something isn't working
#2777 opened Feb 12, 2025 by Oldpan
2 of 4 tasks
DeepSeek-V3 fp8 tp32 failed to convert checkpoint bug Something isn't working
#2776 opened Feb 12, 2025 by MtFitzRoy
2 of 4 tasks
Limit max GPU memory used
#2773 opened Feb 11, 2025 by bri25yu
Cannot create checkpoint for llama-3.2 (1B, 3B) bug Something isn't working
#2772 opened Feb 11, 2025 by falkbene
3 of 4 tasks
CUDA Illegal memory access for certain input sizes to Whisper bug Something isn't working
#2767 opened Feb 9, 2025 by MahmoudAshraf97
2 of 4 tasks
Mixtral SmoothQuant
#2759 opened Feb 7, 2025 by shana34
Building from source does not work bug Something isn't working Investigating triaged Issue has been triaged by maintainers
#2757 opened Feb 6, 2025 by maximzubkov
2 of 4 tasks
Experimental pytorch workflow question Further information is requested triaged Issue has been triaged by maintainers
#2750 opened Feb 6, 2025 by ttim
KV Cache quantization is not working with Whisper bug Something isn't working Investigating Low Precision Issue about lower bit quantization, including int8, int4, fp8 triaged Issue has been triaged by maintainers
#2748 opened Feb 5, 2025 by MahmoudAshraf97
3 of 4 tasks
fp8_rowwise and kv cache question Further information is requested triaged Issue has been triaged by maintainers
#2740 opened Feb 3, 2025 by vitipunto
DeepSeek-R1-Distill-Llama-70B int4 quantized version of the model generates garbage values bug Something isn't working triaged Issue has been triaged by maintainers
#2735 opened Feb 1, 2025 by kelkarn
2 of 4 tasks
Lora error while building tensorrt llm engine for mllama bug Something isn't working Investigating Lora/P-tuning triaged Issue has been triaged by maintainers
#2733 opened Feb 1, 2025 by nbowon
2 of 4 tasks