
Conversation

@ajrasane (Contributor) commented on Dec 2, 2025

## What does this PR do?

**Type of change:** New feature

**Overview:**

- Implemented functions for the MXFP8 quant exporter (see the sketch below)
- Integrated autocast for converting the model to fp16
- Deprecated `quantize_weights_to_mxfp8`
- Updated tests
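For background, here is a minimal numpy sketch of the scheme the exporter targets: per the OCP MX spec, MXFP8 groups values into 32-element blocks of FP8 E4M3 that share one power-of-two (E8M0) scale. The helper name and the fake-quant shortcut are illustrative, not the exporter's actual code:

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite FP8 E4M3 magnitude
EMAX_E4M3 = 8     # exponent of the largest E4M3 binade
BLOCK = 32        # MX block size per the OCP MX spec

def mxfp8_fake_quant(x: np.ndarray) -> np.ndarray:
    """Fake-quantize a 1-D tensor: per-block E8M0 scale + E4M3 range clip.

    Illustrative only; the real exporter emits ONNX Q/DQ nodes instead.
    """
    out = np.empty_like(x)
    for i in range(0, x.size, BLOCK):
        block = x[i:i + BLOCK]
        amax = float(np.abs(block).max())
        # Shared power-of-two (E8M0) scale, following the MX spec.
        exp = int(np.floor(np.log2(amax))) - EMAX_E4M3 if amax > 0 else 0
        scale = 2.0 ** exp
        # Clip to the E4M3 range; real kernels also round to the E4M3 grid.
        out[i:i + BLOCK] = np.clip(block / scale, -E4M3_MAX, E4M3_MAX) * scale
    return out

w = np.random.randn(1024).astype(np.float32)
w_mx = mxfp8_fake_quant(w)
print(float(np.abs(w - w_mx).max()))  # clipping error only, in this sketch
```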

## Usage

```
python torch_quant_to_onnx.py --quantize_mode=mxfp8 --onnx_save_path=vit_base_patch16_224.mxfp8.onnx --calibration_data_size 64 --batch_size 128
```
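For readers unfamiliar with the example script, the flow is roughly the following. This is a hedged sketch assuming the standard `modelopt.torch.quantization` API (`mtq.quantize` with the built-in `MXFP8_DEFAULT_CFG`); the real script additionally loads calibration data and drives the MXFP8-specific ONNX export path:

```python
import timm
import torch
import modelopt.torch.quantization as mtq

# Stand-ins for the script's model and calibration loader.
model = timm.create_model("vit_base_patch16_224", pretrained=True).cuda().eval()
calib_batch = torch.randn(8, 3, 224, 224, device="cuda")

def forward_loop(m):
    # One calibration pass; the script iterates over real calibration data.
    with torch.no_grad():
        m(calib_batch)

# Insert MXFP8 fake-quant nodes and calibrate their scales.
model = mtq.quantize(model, mtq.MXFP8_DEFAULT_CFG, forward_loop)

# The script then exports to ONNX, where the exporter lowers the
# fake-quant nodes into the MXFP8 Q/DQ representation.
torch.onnx.export(model, calib_batch, "vit_base_patch16_224.mxfp8.onnx")
```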

## Testing

```
python evaluate.py --onnx_path=vit_base_patch16_224.mxfp8.onnx --model_name=vit_base_patch16_224 --results_path=./results.txt --batch_size 128
```
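The evaluation amounts to running the exported model under onnxruntime and computing top-1/top-5 accuracy. A self-contained sketch with random stand-in data (the real script uses the ImageNet validation set) looks like:

```python
import numpy as np
import onnxruntime as ort

# Hypothetical batch of preprocessed images and integer class labels.
images = np.random.randn(8, 3, 224, 224).astype(np.float32)
labels = np.random.randint(0, 1000, size=8)

sess = ort.InferenceSession(
    "vit_base_patch16_224.mxfp8.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
input_name = sess.get_inputs()[0].name
logits = sess.run(None, {input_name: images})[0]

# Top-5 predictions per sample, sorted by descending score.
top5 = np.argsort(-logits, axis=1)[:, :5]
top1_acc = (top5[:, 0] == labels).mean()
top5_acc = (top5 == labels[:, None]).any(axis=1).mean()
print(f"top1={top1_acc:.2%} top5={top5_acc:.2%}")
```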

Accuracy and latency results:

```
The top1 accuracy of the model is 85.07%
The top5 accuracy of the model is 97.558%
Inference latency of the model is 6.65451 ms
```

Reference accuracy for fp16:

```
The top1 accuracy of the model is 85.102%
The top5 accuracy of the model is 97.526%
```

## Before your PR is "*Ready for review*"

- **Make sure you read and follow Contributor guidelines** and your commits are signed.
- **Is this change backward compatible?**: No (deprecated `quantize_weights_to_mxfp8`)
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update Changelog?**: No

@ajrasane ajrasane requested review from a team as code owners December 2, 2025 21:34
@ajrasane ajrasane requested a review from cjluo-nv December 2, 2025 21:34
codecov bot commented on Dec 2, 2025

Codecov Report

❌ Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 74.50%. Comparing base (3ef9e39) to head (92ff47a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| modelopt/torch/_deploy/utils/torch_onnx.py | 66.66% | 1 Missing ⚠️ |
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #634      +/-   ##
==========================================
- Coverage   74.57%   74.50%   -0.07%     
==========================================
  Files         183      183              
  Lines       18451    18400      -51     
==========================================
- Hits        13759    13709      -50     
+ Misses       4692     4691       -1     
```

☔ View full report in Codecov by Sentry.

@ajrasane ajrasane requested a review from a team as a code owner December 3, 2025 02:28
```diff
 timeout-minutes: 120
 container: &gpu_container
-  image: nvcr.io/nvidia/pytorch:25.06-py3
+  image: nvcr.io/nvidia/pytorch:25.08-py3
```
Collaborator commented:

25.08 is a CUDA 13 container, and the ort-gpu installation defaults to CUDA 12; is that fine? We also install cupy-cuda12x instead of cupy-cuda13x in setup.py for INT4 ONNX quantization.
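As a quick way to see whether such a CUDA major-version mismatch bites at runtime, one can check what the installed wheels report. A small diagnostic sketch (package names as discussed above):

```python
import onnxruntime as ort

# A CUDA 12 ort-gpu wheel in a CUDA 13 container may fail to load its
# provider libraries, in which case CUDAExecutionProvider goes missing.
print(ort.get_available_providers())

import cupy

# cupy-cuda12x vs cupy-cuda13x must match the container's CUDA toolkit.
print(cupy.cuda.runtime.runtimeGetVersion())  # e.g. 12040 for CUDA 12.4
```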

Contributor (author) replied:

There are issues with using autocast with this TensorRT version, so I will disable it for mxfp8 for now.
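For context on what the autocast step accomplishes, converting an ONNX model's float32 tensors to fp16 can be illustrated with the generic `onnxconverter_common` utility. This is only a stand-in, not modelopt's own autocast tool, which handles precision-sensitive ops more carefully:

```python
import onnx
from onnxconverter_common import float16

# Load the exported model and down-cast eligible float32 tensors to fp16,
# keeping the graph inputs/outputs in their original types.
model = onnx.load("vit_base_patch16_224.onnx")
model_fp16 = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save(model_fp16, "vit_base_patch16_224.fp16.onnx")
```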

Collaborator commented:

Please rebase to the latest main branch so all tests can run in GitHub for this PR. CICD is now migrated to .github/workflows (except the onnx_ptq bash test).

Contributor (author) replied:

Done

@ajrasane ajrasane force-pushed the ajrasane/mxfp8_export branch from f0103e0 to 06fc4df Compare December 4, 2025 21:06
@ajrasane ajrasane enabled auto-merge (squash) December 4, 2025 22:48
@gcunhase (Contributor) commented on Dec 5, 2025

Can you please add the accuracy for the baseline model for comparison? FP32 or FP16 should be okay. Thanks.

@ajrasane ajrasane force-pushed the ajrasane/mxfp8_export branch from 06fc4df to b685019 Compare December 5, 2025 20:06
@ajrasane ajrasane merged commit 93b28d0 into main Dec 5, 2025
36 checks passed
@ajrasane ajrasane deleted the ajrasane/mxfp8_export branch December 5, 2025 22:03
kevalmorabia97 pushed a commit that referenced this pull request Dec 7, 2025