@ruro (Contributor) commented Nov 21, 2025

I added support for exporting QuantizeLinear/DequantizeLinear nodes (from fake_quantize_per_*_affine torch operators) in a previous PR.

Unfortunately, the current default onnxscript optimizer settings tend to remove weight quantization automatically. This happens because the Weight -> QDQ -> ... pattern looks to the optimizer like a constant subgraph, so it is folded into a single precomputed constant, QDQ(Weight) -> ....

I believe this behavior is undesirable: the presence of QDQ nodes in the graph is what allows inference engines to run the supported computations using quantized data types. In other words, the purpose of QDQ nodes is to carry the relevant quantization "metadata" (scale and zero point), so they normally shouldn't be constant folded.
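To illustrate the point above, here is a minimal pure-Python sketch of what a per-tensor QuantizeLinear -> DequantizeLinear pair computes on a weight. The helper names, the int8 range, and the example values are assumptions made for illustration; this is not the onnxscript or ONNX Runtime implementation.

```python
# Illustrative only: hypothetical helpers mimicking per-tensor, int8 QDQ semantics.

def quantize_linear(x, scale, zero_point, qmin=-128, qmax=127):
    """Divide by scale, round half to even, add zero point, saturate to [qmin, qmax]."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in x]


def dequantize_linear(q, scale, zero_point):
    """Map quantized integers back to floats."""
    return [(v - zero_point) * scale for v in q]


weight = [0.26, -0.51, 1.0]
scale, zero_point = 0.25, 0

# What an inference engine needs in order to run the layer in int8:
q = quantize_linear(weight, scale, zero_point)

# What constant folding leaves behind: a plain float tensor.
folded = dequantize_linear(q, scale, zero_point)

print(q)       # integer values an engine could use directly
print(folded)  # float constant with no scale/zero point attached
```

Once the pair is folded, only `folded` remains in the graph. The values are still snapped to the quantization grid, but the scale, the zero point, and the integer representation are no longer visible to the inference engine.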

I have extended the existing logic in FoldConstantsPass that was used to exclude ConstantOfShape from constant folding, so that it now also excludes QuantizeLinear and DequantizeLinear.
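The idea behind the exclusion can be sketched with a toy constant folder. Everything here (the Node class, NON_FOLDABLE_OPS, fold_constants, and the evaluators table) is hypothetical and exists only for illustration; it does not reproduce the actual FoldConstantsPass code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op_type: str
    inputs: list
    output: str


# Hypothetical exclusion set: ops that are never folded, even with constant inputs.
NON_FOLDABLE_OPS = {"ConstantOfShape", "QuantizeLinear", "DequantizeLinear"}


def fold_constants(nodes, constants, evaluators):
    """Fold nodes whose inputs are all constant; return the nodes that remain.

    `constants` maps value names to Python values and is updated in place.
    """
    remaining = []
    for node in nodes:
        foldable = (
            node.op_type not in NON_FOLDABLE_OPS
            and node.op_type in evaluators
            and all(name in constants for name in node.inputs)
        )
        if foldable:
            args = [constants[name] for name in node.inputs]
            constants[node.output] = evaluators[node.op_type](*args)
        else:
            remaining.append(node)
    return remaining


# Evaluators exist for the QDQ ops too, so only the exclusion set keeps them alive.
evaluators = {
    "Neg": lambda v: [-a for a in v],
    "QuantizeLinear": lambda v, s, z: [round(a / s) + z for a in v],
    "DequantizeLinear": lambda q, s, z: [(a - z) * s for a in q],
}
constants = {"w": [0.5, -1.0], "scale": 0.25, "zp": 0}
nodes = [
    Node("Neg", ["w"], "w_neg"),
    Node("QuantizeLinear", ["w_neg", "scale", "zp"], "w_q"),
    Node("DequantizeLinear", ["w_q", "scale", "zp"], "w_dq"),
    Node("MatMul", ["x", "w_dq"], "y"),
]

remaining = fold_constants(nodes, constants, evaluators)
print([n.op_type for n in remaining])
```

In this toy graph the Neg node is folded into a new constant, while the QuantizeLinear/DequantizeLinear pair and the downstream MatMul stay in the graph, which is the outcome the change aims for.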

I haven't found any tests verifying this behavior for ConstantOfShape, and I'm not sure how to set up such a unit test, so I have left this code untested for now. If adding tests is mandatory, please give me a hint on where I should add such a test and what the best way would be to assert that the optimized graph matches expectations (ideally without reinventing the wheel or manually introspecting the ir.Model object).

codecov bot commented Nov 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.10%. Comparing base (9dbf685) to head (a377a0b).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2713   +/-   ##
=======================================
  Coverage   70.10%   70.10%           
=======================================
  Files         226      226           
  Lines       27226    27228    +2     
  Branches     2747     2748    +1     
=======================================
+ Hits        19086    19088    +2     
  Misses       7194     7194           
  Partials      946      946           
