@kinjalpatel27 kinjalpatel27 commented Dec 4, 2025

What does this PR do?

Type of change: Bug fix

Overview:
BertSDPASelfAttention quantization was failing.
Issue: Decorators (e.g., @deprecate_kwarg in BERT) wrap methods and change their free variables. The modified AST needs __class__ for super(), but the decorated wrapper has different freevars (e.g., 'additional_message', 'func').

Fix: Search the decorator's closure to find the actual undecorated method with matching free variables, then use its globals and closure for bytecode reconstruction.
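The closure search can be illustrated with a minimal sketch. The names here (find_undecorated, the toy deprecate_kwarg) are illustrative stand-ins, not the PR's actual implementation: a decorated method's __code__ has the wrapper's freevars (e.g. 'func'), while the original method that calls super() carries '__class__', so we walk the wrapper's closure cells looking for a function whose free variables match.

```python
import functools

def deprecate_kwarg(old_name):
    """Toy decorator that, like BERT's @deprecate_kwarg, wraps a method
    and thereby changes its free variables ('func', 'old_name')."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            kwargs.pop(old_name, None)  # drop the deprecated kwarg
            return func(*args, **kwargs)
        return wrapper
    return decorator

def find_undecorated(func, expected_freevars):
    """Search func's closure (depth-first) for the underlying function
    whose code object has exactly the expected free variables, e.g.
    ('__class__',) for a method that uses zero-argument super()."""
    seen, stack = set(), [func]
    while stack:
        f = stack.pop()
        if id(f) in seen:
            continue
        seen.add(id(f))
        code = getattr(f, "__code__", None)
        if code is not None and set(code.co_freevars) == set(expected_freevars):
            return f  # found the undecorated method
        # Descend into closure cells that hold callables (e.g. 'func').
        for cell in getattr(f, "__closure__", None) or ():
            if callable(cell.cell_contents):
                stack.append(cell.cell_contents)
    return None

class Base:
    def forward(self, x):
        return x

class Child(Base):
    @deprecate_kwarg("old")
    def forward(self, x, **kwargs):
        # Zero-arg super() makes '__class__' a free variable of this method.
        return super().forward(x) + 1
```

Here Child.forward is the wrapper (freevars 'func' and 'old_name'), and find_undecorated(Child.forward, ("__class__",)) recovers the original method, whose __globals__ and __closure__ can then be reused for bytecode reconstruction.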

Usage

cd examples/chained_optimizations
bash scripts/1_prune.sh
bash scripts/2_int8_quantize.sh

Testing

Added unit test: test_kv_quant_bert

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: NA

Additional Information

@kinjalpatel27 kinjalpatel27 requested a review from a team as a code owner December 4, 2025 22:27
Signed-off-by: Kinjal Patel <kinjalpravin@nvidia.com>

codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.68%. Comparing base (097037d) to head (39b3cc7).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #648   +/-   ##
=======================================
  Coverage   74.68%   74.68%           
=======================================
  Files         183      183           
  Lines       18552    18552           
=======================================
  Hits        13856    13856           
  Misses       4696     4696           


@kinjalpatel27 kinjalpatel27 merged commit 3ef9e39 into main Dec 5, 2025
36 checks passed
@kinjalpatel27 kinjalpatel27 deleted the kinjal/int8_quant branch December 5, 2025 03:18
kevalmorabia97 pushed a commit that referenced this pull request Dec 7, 2025
soodoshll pushed a commit to soodoshll/TensorRT-Model-Optimizer that referenced this pull request Dec 8, 2025