@kinjalpatel27 kinjalpatel27 commented Dec 4, 2025

What does this PR do?

Type of change: Bug fix

Overview:
BertSDPASelfAttention quantization was failing.
Issue: Decorators (e.g., @deprecate_kwarg in BERT) wrap methods and change their free variables. The modified AST needs __class__ for super(), but the decorated wrapper has different freevars (e.g., 'additional_message', 'func').

Fix: Search the decorator's closure to find the actual undecorated method with matching free variables, then use its globals and closure for bytecode reconstruction.
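The closure search can be illustrated with a minimal sketch. The names here (find_undecorated, the toy deprecate_kwarg) are illustrative stand-ins, not the PR's actual implementation: a decorated method's __code__ has the wrapper's freevars (e.g. 'func'), while the original method that calls super() carries '__class__', so we walk the wrapper's closure cells looking for a function whose free variables match.

```python
import functools

def deprecate_kwarg(old_name):
    """Toy decorator that, like BERT's @deprecate_kwarg, wraps a method
    and thereby changes its free variables ('func', 'old_name')."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            kwargs.pop(old_name, None)  # drop the deprecated kwarg
            return func(*args, **kwargs)
        return wrapper
    return decorator

def find_undecorated(func, expected_freevars):
    """Search func's closure (depth-first) for the underlying function
    whose code object has exactly the expected free variables, e.g.
    ('__class__',) for a method that uses zero-argument super()."""
    seen, stack = set(), [func]
    while stack:
        f = stack.pop()
        if id(f) in seen:
            continue
        seen.add(id(f))
        code = getattr(f, "__code__", None)
        if code is not None and set(code.co_freevars) == set(expected_freevars):
            return f  # found the undecorated method
        # Descend into closure cells that hold callables (e.g. 'func').
        for cell in getattr(f, "__closure__", None) or ():
            if callable(cell.cell_contents):
                stack.append(cell.cell_contents)
    return None

class Base:
    def forward(self, x):
        return x

class Child(Base):
    @deprecate_kwarg("old")
    def forward(self, x, **kwargs):
        # Zero-arg super() makes '__class__' a free variable of this method.
        return super().forward(x) + 1
```

Here Child.forward is the wrapper (freevars 'func' and 'old_name'), and find_undecorated(Child.forward, ("__class__",)) recovers the original method, whose __globals__ and __closure__ can then be reused for bytecode reconstruction.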

Usage

cd examples/chained_optimizations
bash scripts/1_prune.sh
bash scripts/2_int8_quantize.sh

Testing

Added unit test: test_kv_quant_bert

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: NA

Additional Information

@kinjalpatel27 kinjalpatel27 requested a review from a team as a code owner December 4, 2025 22:27
Signed-off-by: Kinjal Patel <kinjalpravin@nvidia.com>

codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.68%. Comparing base (097037d) to head (39b3cc7).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #648   +/-   ##
=======================================
  Coverage   74.68%   74.68%           
=======================================
  Files         183      183           
  Lines       18552    18552           
=======================================
  Hits        13856    13856           
  Misses       4696     4696           


@kinjalpatel27 kinjalpatel27 merged commit 3ef9e39 into main Dec 5, 2025
36 checks passed
@kinjalpatel27 kinjalpatel27 deleted the kinjal/int8_quant branch December 5, 2025 03:18
kevalmorabia97 pushed a commit that referenced this pull request Dec 7, 2025
soodoshll pushed a commit to soodoshll/TensorRT-Model-Optimizer that referenced this pull request Dec 8, 2025