Don't constant fold Quantize/DequantizeLinear nodes by default #2713

ruro · 2025-11-21T05:07:45Z

I added support for exporting QuantizeLinear/DequantizeLinear nodes (from fake_quantize_per_*_affine torch operators) in a previous PR.

Unfortunately, with the current default settings, the onnxscript optimizer tends to remove weight quantization automatically. This happens because every input of the Weight -> QDQ -> ... pattern is a constant, so it looks as if it can simply be constant folded to QDQ(Weight) -> ....

I believe that this behavior is not desirable, since the presence of QDQ nodes in the graph is what allows inference engines to run the supported computations using quantized data types. So the purpose of QDQ nodes is to hold the relevant quantization "metadata". As such, they normally shouldn't be constant folded.
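To make the effect concrete, here is a minimal pure-Python sketch of what folding a weight QDQ pair amounts to numerically. The helpers follow the ONNX definitions of QuantizeLinear (round, add zero point, saturate) and DequantizeLinear, but the function names, the int8 range and the example weight are illustrative assumptions, not code from this PR.

```python
def quantize_linear(values, scale, zero_point, qmin=-128, qmax=127):
    # saturate(round(x / scale) + zero_point); Python's round() rounds half to even
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


def dequantize_linear(qvalues, scale, zero_point):
    # (q - zero_point) * scale
    return [(q - zero_point) * scale for q in qvalues]


weight = [0.1, -0.26, 0.5]      # illustrative float weight
scale, zero_point = 0.25, 0     # illustrative quantization parameters

q_weight = quantize_linear(weight, scale, zero_point)
folded = dequantize_linear(q_weight, scale, zero_point)

print(q_weight)  # [0, -1, 2]
print(folded)    # [0.0, -0.25, 0.5]
```

After folding, only the float tensor `folded` would remain in the graph. The integer values, the scale and the zero point no longer appear as nodes, so an inference engine has nothing left to map onto a quantized kernel.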

I have extended the existing logic in FoldConstantsPass that was used to exclude ConstantOfShape from constant folding.
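The idea can be sketched with a toy constant-folding loop that skips a configurable set of op types. This is not the FoldConstantsPass implementation: `Node`, `fold_constants`, `DEFAULT_NON_FOLDABLE_OPS` and the scalar evaluators are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical default: op types that are never folded, even with constant inputs.
DEFAULT_NON_FOLDABLE_OPS = frozenset(
    {"ConstantOfShape", "QuantizeLinear", "DequantizeLinear"}
)


@dataclass
class Node:
    op_type: str
    inputs: list
    output: str


def fold_constants(nodes, constants, evaluators, non_foldable=DEFAULT_NON_FOLDABLE_OPS):
    """Fold nodes whose inputs are all constants, skipping excluded op types.

    `nodes` must be topologically sorted. `constants` maps value names to
    values and is updated in place. Returns the nodes left in the graph.
    """
    remaining = []
    for node in nodes:
        foldable = (
            node.op_type not in non_foldable
            and node.op_type in evaluators
            and all(name in constants for name in node.inputs)
        )
        if foldable:
            args = [constants[name] for name in node.inputs]
            constants[node.output] = evaluators[node.op_type](*args)
        else:
            remaining.append(node)
    return remaining


# Toy scalar evaluators, enough to exercise the loop.
evaluators = {
    "Neg": lambda x: -x,
    "QuantizeLinear": lambda x, s, z: round(x / s) + z,
    "DequantizeLinear": lambda q, s, z: (q - z) * s,
}
nodes = [
    Node("Neg", ["w"], "w_neg"),
    Node("QuantizeLinear", ["w_neg", "scale", "zp"], "w_q"),
    Node("DequantizeLinear", ["w_q", "scale", "zp"], "w_dq"),
]
constants = {"w": 0.5, "scale": 0.25, "zp": 0}

remaining = fold_constants(nodes, constants, evaluators)
print([n.op_type for n in remaining])  # ['QuantizeLinear', 'DequantizeLinear']
```

In this sketch, passing `non_foldable=frozenset()` folds the whole chain, which corresponds to the behavior described above where the weight quantization disappears.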

I haven't found any tests verifying this behavior for ConstantOfShape, and I'm not sure how to set up such a unit test, so I have left this code untested for now. If adding tests is mandatory, please give me a hint on where I should add such a test and what the best way would be to assert that the optimized graph matches expectations (hopefully without reinventing the wheel or manually introspecting the ir.Model object).
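One possible shape for such a check, sketched with stand-in objects: count the op types that survive optimization and assert on the counts. The sketch assumes only that the optimized graph can be iterated as nodes exposing an `op_type` attribute. That is assumed here rather than verified against ir.Model, and `op_type_counts` is a hypothetical helper.

```python
from collections import Counter
from types import SimpleNamespace


def op_type_counts(nodes):
    """Count nodes per op type for any iterable of objects with `.op_type`."""
    return Counter(node.op_type for node in nodes)


# Stand-ins for the nodes of an optimized graph (illustrative only).
optimized_nodes = [
    SimpleNamespace(op_type="QuantizeLinear"),
    SimpleNamespace(op_type="DequantizeLinear"),
    SimpleNamespace(op_type="MatMul"),
]

counts = op_type_counts(optimized_nodes)
assert counts["QuantizeLinear"] == 1
assert counts["DequantizeLinear"] == 1
```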

codecov · 2025-11-24T11:21:10Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.10%. Comparing base (9dbf685) to head (a377a0b).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2713   +/-   ##
=======================================
  Coverage   70.10%   70.10%           
=======================================
  Files         226      226           
  Lines       27226    27228    +2     
  Branches     2747     2748    +1     
=======================================
+ Hits        19086    19088    +2     
  Misses       7194     7194           
  Partials      946      946


ruro added 2 commits November 21, 2025 07:48

refactor FoldConstantsPass default rules handling

e6c62a1

don't constant fold Quantize/DequantizeLinear nodes by default

a377a0b

github-project-automation bot added this to ONNX Script Review Board Nov 21, 2025

github-project-automation bot moved this to Todo in ONNX Script Review Board Nov 21, 2025
