Commit 93b28d0
authored
[OMNIML-2244] Create the MXFP8 quant exporter (#634)
## What does this PR do?
**Type of change:**
New feature
**Overview:** ?
- Implemented functions for the MXFP8 quant exporter
- Integrated autocast for converting model to fp16
- deprecated quantize_weights_to_mxfp8
- Updated tests
## Usage
```
python torch_quant_to_onnx.py --quantize_mode=mxfp8 --onnx_save_path=vit_base_patch16_224.mxfp8.onnx --calibration_data_size 64 --batch_size 128
```
## Testing
```
python evaluate.py --onnx_path=vit_base_patch16_224.mxfp8.onnx --model_name=vit_base_patch16_224 --results_path=./results.txt --batch_size 128
```
Accuracy and latency results
```
The top1 accuracy of the model is 85.07%
The top5 accuracy of the model is 97.558%
Inference latency of the model is 6.65451 ms
```
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: No
- deprecated quantize_weights_to_mxfp8
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
No <!--- Only for new features, API changes, critical bug fixes or bw
breaking changes. -->
---------
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>1 parent 3ef9e39 commit 93b28d0
File tree
4 files changed
+178
-123
lines changed- modelopt
- onnx
- export
- quantization
- torch/_deploy/utils
- tests/unit/onnx
4 files changed
+178
-123
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
19 | 27 | | |
20 | 28 | | |
21 | 29 | | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
22 | 66 | | |
23 | | - | |
24 | 67 | | |
25 | 68 | | |
26 | 69 | | |
27 | 70 | | |
28 | 71 | | |
29 | 72 | | |
| 73 | + | |
30 | 74 | | |
31 | 75 | | |
32 | 76 | | |
33 | | - | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
34 | 108 | | |
35 | 109 | | |
36 | 110 | | |
37 | | - | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
38 | 144 | | |
39 | 145 | | |
40 | 146 | | |
41 | | - | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | | - | |
35 | | - | |
| 34 | + | |
36 | 35 | | |
37 | 36 | | |
38 | 37 | | |
| |||
1036 | 1035 | | |
1037 | 1036 | | |
1038 | 1037 | | |
1039 | | - | |
1040 | | - | |
1041 | | - | |
1042 | | - | |
1043 | | - | |
1044 | | - | |
1045 | | - | |
1046 | | - | |
1047 | | - | |
1048 | | - | |
1049 | | - | |
1050 | | - | |
1051 | | - | |
1052 | | - | |
1053 | | - | |
1054 | | - | |
1055 | | - | |
1056 | | - | |
1057 | | - | |
1058 | | - | |
1059 | | - | |
1060 | | - | |
1061 | | - | |
1062 | | - | |
1063 | | - | |
1064 | | - | |
1065 | | - | |
1066 | | - | |
1067 | | - | |
1068 | | - | |
1069 | | - | |
1070 | | - | |
1071 | | - | |
1072 | | - | |
1073 | | - | |
1074 | | - | |
1075 | | - | |
1076 | | - | |
1077 | | - | |
1078 | | - | |
1079 | | - | |
1080 | | - | |
1081 | | - | |
1082 | | - | |
1083 | | - | |
1084 | | - | |
1085 | | - | |
1086 | | - | |
1087 | | - | |
1088 | | - | |
1089 | | - | |
1090 | | - | |
1091 | | - | |
1092 | | - | |
1093 | | - | |
1094 | | - | |
1095 | | - | |
1096 | | - | |
1097 | | - | |
1098 | | - | |
1099 | | - | |
1100 | | - | |
1101 | | - | |
1102 | | - | |
1103 | | - | |
1104 | | - | |
1105 | | - | |
1106 | | - | |
1107 | | - | |
1108 | | - | |
1109 | | - | |
1110 | | - | |
1111 | | - | |
1112 | | - | |
1113 | | - | |
1114 | | - | |
1115 | | - | |
1116 | | - | |
1117 | | - | |
1118 | | - | |
1119 | | - | |
1120 | | - | |
1121 | | - | |
1122 | | - | |
1123 | | - | |
1124 | | - | |
1125 | | - | |
1126 | | - | |
1127 | | - | |
1128 | | - | |
1129 | | - | |
1130 | | - | |
1131 | | - | |
1132 | | - | |
1133 | | - | |
1134 | | - | |
1135 | | - | |
1136 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
| 43 | + | |
48 | 44 | | |
49 | 45 | | |
50 | 46 | | |
| |||
364 | 360 | | |
365 | 361 | | |
366 | 362 | | |
| 363 | + | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
| 367 | + | |
367 | 368 | | |
368 | 369 | | |
369 | 370 | | |
| |||
560 | 561 | | |
561 | 562 | | |
562 | 563 | | |
563 | | - | |
| 564 | + | |
564 | 565 | | |
565 | | - | |
566 | | - | |
567 | | - | |
568 | 566 | | |
569 | 567 | | |
570 | 568 | | |
| |||
575 | 573 | | |
576 | 574 | | |
577 | 575 | | |
578 | | - | |
| 576 | + | |
579 | 577 | | |
580 | 578 | | |
581 | 579 | | |
| |||
0 commit comments