# Commit 3ef9e39: Fix BertSdpaSelfAttention quantization (#648)
## What does this PR do?
**Type of change:** Bug fix
**Overview:**
`BertSdpaSelfAttention` quantization was failing.
Issue: Decorators such as `@deprecate_kwarg` in BERT wrap methods and change their free variables. The modified AST needs `__class__` to resolve zero-argument `super()`, but the decorated wrapper carries the decorator's own freevars (e.g., `additional_message`, `func`) instead of the original method's.
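
For illustration only (not code from this PR), a minimal sketch of how a `deprecate_kwarg`-style decorator changes a method's free variables; the decorator body below is a hypothetical stand-in:

```python
import functools

def deprecate_kwarg_like(old_name, additional_message=""):
    """Hypothetical stand-in for transformers' @deprecate_kwarg decorator."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                print(f"'{old_name}' is deprecated. {additional_message}")
                kwargs.pop(old_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator

class Base:
    def forward(self, x):
        return x

class Child(Base):
    @deprecate_kwarg_like("past_key_value", additional_message="use 'cache' instead")
    def forward(self, x, **kwargs):
        return super().forward(x)  # zero-arg super() requires the __class__ freevar

# The wrapper closes over the decorator's locals, not over __class__:
print(Child.forward.__code__.co_freevars)              # e.g. ('additional_message', 'func', 'old_name')
print(Child.forward.__wrapped__.__code__.co_freevars)  # ('__class__',)
```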
Fix: Search the decorator's closure for the actual undecorated method whose free variables match, then use that method's globals and closure for bytecode reconstruction.
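
A minimal sketch of the recovery step, assuming the wrapper keeps the original function somewhere in its closure cells (as `functools.wraps`-style decorators typically do); `find_undecorated` is an illustrative helper, not the PR's actual implementation:

```python
import types

def find_undecorated(method, required_freevars=("__class__",)):
    """Search a decorated method's closure cells for the original function
    whose free variables match what the rewritten bytecode needs.

    Illustrative sketch only; names and structure are assumptions.
    """
    stack, visited = [method], set()
    while stack:
        func = stack.pop()
        if not isinstance(func, types.FunctionType) or id(func) in visited:
            continue
        visited.add(id(func))
        if set(required_freevars) <= set(func.__code__.co_freevars):
            return func
        # Queue any objects captured in this function's closure cells.
        for cell in func.__closure__ or ():
            stack.append(cell.cell_contents)
    return None

# Usage sketch: the recovered function supplies the globals and closure that the
# reconstructed bytecode needs so that super() can still resolve __class__.
# original = find_undecorated(Child.forward)
# patched = types.FunctionType(new_code, original.__globals__, closure=original.__closure__)
```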
## Usage
```bash
cd examples/chained_optimizations
bash scripts/1_prune.sh
bash scripts/2_int8_quantize.sh
```
## Testing
Added unit test: `test_kv_quant_bert`.
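
For context, a rough sketch of what a `test_kv_quant_bert`-style unit test could look like; the model size, the choice of `INT8_DEFAULT_CFG` (a stand-in for the KV-cache config), and the forward loop below are assumptions, not the repo's actual test:

```python
import torch
from transformers import AutoModel, BertConfig

import modelopt.torch.quantization as mtq


def test_kv_quant_bert():
    # Tiny BERT using the SDPA attention path, whose forward() is wrapped by
    # @deprecate_kwarg in recent transformers releases.
    config = BertConfig(
        hidden_size=32, num_hidden_layers=2, num_attention_heads=2,
        intermediate_size=64, vocab_size=128,
    )
    model = AutoModel.from_config(config, attn_implementation="sdpa")
    dummy_input = torch.randint(0, config.vocab_size, (1, 8))

    def forward_loop(m):
        m(dummy_input)

    # INT8_DEFAULT_CFG is used here as a stand-in; the real test exercises a
    # KV-cache quantization config.
    model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)
    model(dummy_input)  # quantized forward should run without errors
```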
## Before your PR is "*Ready for review*"
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: Yes
- **Did you add or update any necessary documentation?**: Yes
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
NA
## Additional Information
Signed-off-by: Kinjal Patel <kinjalpravin@nvidia.com>
## Files changed

3 files changed: 86 additions, 12 deletions

- modelopt/torch/quantization/plugins
- tests/_test_utils/torch
- tests/unit/torch/quantization/plugins