
Release 2.2.0

@ofirgo ofirgo released this 25 Aug 11:00
· 22 commits to main since this release
bb24123

What's Changed

General changes

  • Quantization enhancements:

    • Improved Hessian information computation runtime, which speeds up GPTQ, HMSE, and Mixed Precision with Hessian-based loss.

      • The get_keras_gptq_config and get_pytorch_gptq_config functions now accept a hessian_batch_size argument to control the batch size used in the Hessian computation for GPTQ.
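MCT approximates Hessian information without materializing full Hessians. As a rough illustration of why batching the computation helps, here is a generic batched Hutchinson-style trace estimator that only needs Hessian-vector products; this is a sketch of the general technique, not MCT's implementation, and all names in it are hypothetical:

```python
import numpy as np

def hutchinson_trace(hvp, dim, n_samples=64, batch_size=16, rng=None):
    """Estimate trace(H) using only Hessian-vector products.

    hvp: callable mapping a (dim, b) matrix V to H @ V.
    Probe vectors are processed in batches, so each hvp call is
    amortized over several probes -- the same idea as controlling
    the Hessian batch size.
    """
    rng = rng or np.random.default_rng(0)
    total, seen = 0.0, 0
    while seen < n_samples:
        b = min(batch_size, n_samples - seen)
        v = rng.choice([-1.0, 1.0], size=(dim, b))  # Rademacher probes
        total += np.sum(v * hvp(v))  # adds v_i^T H v_i for each column i
        seen += b
    return total / n_samples

# Example: f(x) = 0.5 x^T A x has Hessian A, whose trace is 1+2+3+4 = 10.
A = np.diag([1.0, 2.0, 3.0, 4.0])
est = hutchinson_trace(lambda v: A @ v, dim=4)
```

For a diagonal Hessian the Rademacher estimator is exact, so `est` recovers the trace; in general the variance shrinks as `n_samples` grows while `batch_size` only trades memory for fewer calls.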
    • Data Generation Upgrade: Improved Speed, Performance and Coverage.

      • Added SmoothAugmentationImagePipeline, an image pipeline implementation that includes Gaussian smoothing, random cropping, and clipping.
      • Improved performance with float16 support in PyTorch.
      • Introduced the ReduceLROnPlateauWithReset scheduler, a learning rate scheduler that reduces the learning rate when a metric has stopped improving and allows resetting it to the initial value after a specified number of bad epochs.
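The scheduler's behavior can be sketched in plain Python. This is an illustrative simplification, not MCT's ReduceLROnPlateauWithReset, and the exact reset semantics (here: reset after `reset_patience` consecutive bad epochs) are an assumption:

```python
class PlateauWithReset:
    """Sketch of a reduce-on-plateau scheduler with reset (not MCT's code).

    Lowers the learning rate by `factor` every `patience` bad epochs;
    after `reset_patience` consecutive bad epochs, restores the
    initial learning rate and starts counting again.
    """

    def __init__(self, lr, factor=0.5, patience=2, reset_patience=4, min_delta=1e-4):
        self.initial_lr = lr
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.reset_patience = reset_patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric):
        if metric < self.best - self.min_delta:
            self.best = metric          # improvement: reset the bad-epoch counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.reset_patience:
                self.lr = self.initial_lr   # reset to the starting rate
                self.bad_epochs = 0
            elif self.bad_epochs % self.patience == 0:
                self.lr *= self.factor      # standard plateau reduction
        return self.lr
```

With `patience=2` and `reset_patience=4`, a stalled metric halves the rate after two bad epochs and restores the initial rate after four.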
    • Shift negative correction for activations.

  • Introduce new Explainable Quantization (Xquant) tool (experimental):

    • Generate a report (viewable in TensorBoard) to troubleshoot performance issues, with histograms and similarity metrics to compare float and quantized models.
    • An Xquant tutorial is available for both PyTorch and Keras.
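To make the report's similarity metrics concrete, here are two common ways to compare a float tensor with its quantized counterpart. These metric choices are illustrative; they are not necessarily the exact set Xquant reports:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two tensors."""
    return float(np.mean((a - b) ** 2))

def cosine_similarity(a, b):
    """Cosine similarity between two flattened tensors (1.0 = identical direction)."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical intermediate outputs of a float model and its quantized version.
float_out = np.array([0.5, -1.0, 2.0, 0.25])
quant_out = np.array([0.5, -1.0, 2.0, 0.0])  # small value rounded to zero
```

Comparing such per-layer outputs side by side is what lets the report localize where quantization hurts the most.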
  • Introduced TPC IMX500.v3 (experimental):

    • Support constants quantization: constants of the Add, Sub, Mul & Div operators are quantized to 8 bits using per-axis power-of-two quantization; the axis is chosen per constant according to the minimum quantization error.
    • IMX500 TPC now supports 16-bit activation quantization for the following operators: Add, Sub, Mul, Concat & Stack.
    • Support assigning the allowed input precision options to each operator, that is, the bit-width used to represent the operator's input activation tensor.
    • Default TPC remains IMX500.v1.
    • For selecting IMX500.v3 in Keras:
      • tpc_v3 = mct.get_target_platform_capabilities("tensorflow", 'imx500', target_platform_version="v3")
      • mct.ptq.keras_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v3)
    • For selecting IMX500.v3 in PyTorch:
      • tpc_v3 = mct.get_target_platform_capabilities("pytorch", 'imx500', target_platform_version="v3")
      • mct.ptq.pytorch_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v3)
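The per-axis power-of-two constant quantization described above can be sketched as follows. This is a simplified NumPy illustration of the technique, not MCT's implementation, and the helper names are hypothetical:

```python
import numpy as np

def pot_quantize(x, axis, n_bits=8):
    """Symmetric power-of-two quantization of x, per-slice along `axis`."""
    reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
    max_abs = np.max(np.abs(x), axis=reduce_axes, keepdims=True)
    # Round each slice's threshold up to the nearest power of two.
    threshold = 2.0 ** np.ceil(np.log2(np.maximum(max_abs, 1e-12)))
    scale = threshold / 2 ** (n_bits - 1)
    q = np.clip(np.round(x / scale), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)
    return q * scale

def quantize_constant(x, n_bits=8):
    """Try each axis and keep the one with minimum quantization error (MSE)."""
    candidates = [(np.mean((x - pot_quantize(x, ax, n_bits)) ** 2), ax)
                  for ax in range(x.ndim)]
    _, axis = min(candidates)
    return pot_quantize(x, axis, n_bits), axis

# Rows with very different magnitudes: per-row (axis 0) scales fit much better.
const = np.array([[0.1, 0.2],
                  [10.0, 20.0]])
q, axis = quantize_constant(const)
```

Because the two rows differ in magnitude by two orders, a per-row scale keeps both rows accurate, so axis 0 wins the error comparison.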
  • Introduced BitWidthConfig API.

  • Tutorials:

Breaking changes

  • To configure OpQuantizationConfig in the TPC, additional arguments have been added:
    • Signedness specifies whether the quantization method is signed or unsigned.
    • supported_input_activation_n_bits sets the number of bits that the operator accepts as input.

Bug fixes:

  • Fixed a bug in PyTorch model reader of reshape operator #1086.
  • Fixed a bug in GPTQ with bias learning for cases where a convolutional layer has None as its bias #1109.
  • Fixed an issue in mixed precision where running only weights (or only activation) compression could fail or produce incorrect results when layers had multiple candidates for the other dimension (activation or weights, respectively). A new filtering procedure now runs before the mixed-precision search to remove the unnecessary candidates #1162.
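The idea behind the fix can be sketched in plain Python. This is an illustrative simplification, not MCT's filtering code; the selection rule (keep the max-activation candidate per weight bit-width) is an assumption for the example:

```python
def filter_candidates(candidates, search_weights_only):
    """Sketch of mixed-precision candidate pre-filtering (not MCT's code).

    candidates: list of (weights_n_bits, activation_n_bits) pairs for a layer.
    When only weights compression is searched, keep a single candidate per
    weight bit-width so the search space is one-dimensional again.
    """
    if not search_weights_only:
        return candidates
    best = {}
    for w_bits, a_bits in candidates:
        # Keep the highest-precision activation option for each weight width.
        if w_bits not in best or a_bits > best[w_bits]:
            best[w_bits] = a_bits
    return sorted(best.items())

# A layer with candidates along both dimensions; only weights are searched.
layer_candidates = [(8, 8), (8, 4), (4, 8), (4, 4), (2, 8)]
filtered = filter_candidates(layer_candidates, search_weights_only=True)
```

After filtering, the search sees exactly one candidate per weight bit-width, which removes the ambiguity that made the search fail or pick incorrectly.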

New Contributors

Welcome @DaniAffCH, @irenaby, and @yardeny-sony on their first contributions! PR #1094, PR #1118, PR #1163

Full Changelog: v2.1.0...v2.2.0