NVIDIA / TensorRT-Model-Optimizer Public

Pull requests: NVIDIA/TensorRT-Model-Optimizer

39 Open 328 Closed

Pull requests list

[NVBUG: 5701937] Clear GPU cache for 3D weight tensors

#649 opened Dec 4, 2025 by cjluo-nv

Fix BertSdpaSelfAttention quantization

#648 opened Dec 4, 2025 by kinjalpatel27

Bump the pip group across 4 directories with 2 updates

Labels: dependencies, python

#646 opened Dec 4, 2025 by dependabot bot

[OMNIML-2244] Create the MXFP8 quant exporter

#634 opened Dec 2, 2025 by ajrasane

Added support for TElinear ops

#632 opened Dec 2, 2025 by kinjalpatel27

Address security concerns in code

#626 opened Dec 2, 2025 by kevalmorabia97

2 of 3 tasks

refactor: eagle config rewrite; update doc

#624 opened Dec 2, 2025 by h-guo18

Updated dependencies in Windows examples

#622 opened Dec 1, 2025 by hthadicherla

Optimize calibrate_draft_vocab to read only required lines when calib…

#618 opened Nov 27, 2025 by Ofir408

MLA eagle for K2

#615 opened Nov 27, 2025 by h-guo18 • Draft

draft: Add per block MSE for NVFP4 and INT4

#613 opened Nov 27, 2025 by Fridah-nv • Draft

Convert compressed-tensor int4 format to GPTQ int4 format

#590 opened Nov 20, 2025 by Edwardf0t1

Yeyu/remove embedding from eagle

#585 opened Nov 20, 2025 by yeyu-nvidia • Draft

Product Rename: TensorRT Model Optimizer to Model Optimizer

#583 opened Nov 20, 2025 by kevalmorabia97

2 tasks done

support for newer checkpoints

#582 opened Nov 20, 2025 by binghanc • Draft

Feat: SGL backend for online SD training

#564 opened Nov 14, 2025 by h-guo18

GPTQ Lite implementation

#555 opened Nov 13, 2025 by sugunav14

1 of 2 tasks

[OMNIML-2850] [3/n] Adds sparse attention calibration

#538 opened Nov 11, 2025 by kaix-nv

[OMNIML-2852] [2/n] Add Core Sparse Attention Infrastructure

#527 opened Nov 7, 2025 by kaix-nv

parallel eagle draft

#523 opened Nov 6, 2025 by yeyu-nvidia • Draft

Support AWQ fake quant for vLLM MoE models

#521 opened Nov 6, 2025 by meenchen • Draft

[Draft] [5526696] Add kv cache quantization support for onnx quantization

#486 opened Oct 31, 2025 by zhanghaoc

Yeyu/set block

#480 opened Oct 28, 2025 by yeyu-nvidia • Draft

Preserve original rope scaling type in export due to transformers library AutoConfig issue

#452 opened Oct 17, 2025 by Edwardf0t1

[1/2] Registry interface for custom quantization functional backend

#449 opened Oct 17, 2025 by realAsma
