Is your feature request related to a problem? Please describe.
I am the author of the SDNQ quantizer and am trying to add native support for pre-quantized models to ModelMixin.from_pretrained.
I need to update the expected and unexpected keys so that Diffusers will pass the quantization scales / zero_points as a parameter to create_quantized_param and won't unnecessarily warn about unexpected keys that are actually used by the quantized model.
Describe the solution you'd like.
Add update_unexpected_keys and update_expected_keys APIs to DiffusersQuantizer.
An example implementation can be found in Transformers' HfQuantizer.
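To illustrate the request, here is a minimal sketch of how the two hooks could look on the Diffusers side, mirroring the shape of the corresponding HfQuantizer hooks in Transformers. The base class and method signatures below are assumptions for illustration (a stand-in, not the actual DiffusersQuantizer), and the `.scale` / `.zero_point` suffixes are hypothetical placeholders for SDNQ's real parameter names:

```python
from typing import List


class DiffusersQuantizerSketch:
    """Stand-in for diffusers' DiffusersQuantizer, showing the proposed hooks."""

    def update_expected_keys(
        self, model, expected_keys: List[str], loaded_keys: List[str]
    ) -> List[str]:
        # Default: leave the expected keys unchanged.
        return expected_keys

    def update_unexpected_keys(
        self, model, unexpected_keys: List[str], prefix: str
    ) -> List[str]:
        # Default: leave the unexpected keys unchanged.
        return unexpected_keys


class SDNQLikeQuantizer(DiffusersQuantizerSketch):
    # Hypothetical suffixes; SDNQ's actual parameter names may differ.
    QUANT_SUFFIXES = (".scale", ".zero_point")

    def update_expected_keys(self, model, expected_keys, loaded_keys):
        # Add the quantization parameters found in the checkpoint so that
        # from_pretrained passes them on to create_quantized_param.
        extra = [k for k in loaded_keys if k.endswith(self.QUANT_SUFFIXES)]
        return expected_keys + extra

    def update_unexpected_keys(self, model, unexpected_keys, prefix):
        # Drop quantization parameters from the warning list: they are
        # consumed by the quantizer, not by the bare nn.Module.
        return [k for k in unexpected_keys if not k.endswith(self.QUANT_SUFFIXES)]
```

With hooks like these, from_pretrained could call the quantizer before computing its missing/unexpected-key warnings, so the scales / zero_points flow through the normal loading path instead of requiring a raw state_dict workaround.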
Describe alternatives you've considered.
As a workaround, I am currently loading the quantization scales / zero_points from the raw state_dict.
Additional context.
Current SDNQ Quantizer code: https://github.com/Disty0/sdnq/blob/25cc7506af516f15d68ec17c7db0a3c5c20de3d3/src/sdnq/quantizer.py#L595-L602