
Release 2.2.0

@ofirgo ofirgo released this 25 Aug 11:00
· 22 commits to main since this release
bb24123

What's Changed

General changes

  • Quantization enhancements:

    • Improved Hessian information computation runtime, which speeds up GPTQ, HMSE, and Mixed Precision with Hessian-based loss.

      • The get_keras_gptq_config and get_pytorch_gptq_config functions now accept a hessian_batch_size argument to control the batch size used in the Hessian computation for GPTQ.
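MCT approximates Hessian information without materializing full Hessians. As a rough illustration of why batching the computation helps, here is a generic batched Hutchinson-style trace estimator that only needs Hessian-vector products; this is a sketch of the general technique, not MCT's implementation, and all names in it are hypothetical:

```python
import numpy as np

def hutchinson_trace(hvp, dim, n_samples=64, batch_size=16, rng=None):
    """Estimate trace(H) using only Hessian-vector products.

    hvp: callable mapping a (dim, b) matrix V to H @ V.
    Probe vectors are processed in batches, so each hvp call is
    amortized over several probes -- the same idea as controlling
    the Hessian batch size.
    """
    rng = rng or np.random.default_rng(0)
    total, seen = 0.0, 0
    while seen < n_samples:
        b = min(batch_size, n_samples - seen)
        v = rng.choice([-1.0, 1.0], size=(dim, b))  # Rademacher probes
        total += np.sum(v * hvp(v))  # adds v_i^T H v_i for each column i
        seen += b
    return total / n_samples

# Example: f(x) = 0.5 x^T A x has Hessian A, whose trace is 1+2+3+4 = 10.
A = np.diag([1.0, 2.0, 3.0, 4.0])
est = hutchinson_trace(lambda v: A @ v, dim=4)
```

For a diagonal Hessian the Rademacher estimator is exact, so `est` recovers the trace; in general the variance shrinks as `n_samples` grows while `batch_size` only trades memory for fewer calls.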
    • Data Generation Upgrade: Improved Speed, Performance and Coverage.

      • Added SmoothAugmentationImagePipeline, an image pipeline implementation that includes Gaussian smoothing, random cropping, and clipping.
      • Improved performance with float16 support in PyTorch.
      • Introduced the ReduceLROnPlateauWithReset scheduler, a learning rate scheduler that reduces the learning rate when a metric has stopped improving and allows resetting it to the initial value after a specified number of bad epochs.
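The scheduler's behavior can be sketched in plain Python. This is an illustrative simplification, not MCT's ReduceLROnPlateauWithReset, and the exact reset semantics (here: reset after `reset_patience` consecutive bad epochs) are an assumption:

```python
class PlateauWithReset:
    """Sketch of a reduce-on-plateau scheduler with reset (not MCT's code).

    Lowers the learning rate by `factor` every `patience` bad epochs;
    after `reset_patience` consecutive bad epochs, restores the
    initial learning rate and starts counting again.
    """

    def __init__(self, lr, factor=0.5, patience=2, reset_patience=4, min_delta=1e-4):
        self.initial_lr = lr
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.reset_patience = reset_patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric):
        if metric < self.best - self.min_delta:
            self.best = metric          # improvement: reset the bad-epoch counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.reset_patience:
                self.lr = self.initial_lr   # reset to the starting rate
                self.bad_epochs = 0
            elif self.bad_epochs % self.patience == 0:
                self.lr *= self.factor      # standard plateau reduction
        return self.lr
```

With `patience=2` and `reset_patience=4`, a stalled metric halves the rate after two bad epochs and restores the initial rate after four.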
    • Shift negative correction for activations.

  • Introduce new Explainable Quantization (Xquant) tool (experimental):

    • Generate a report (viewable in TensorBoard) to troubleshoot performance issues, with histograms and similarity metrics to compare float and quantized models.
    • An Xquant tutorial is available for both PyTorch and Keras.
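To make the report's similarity metrics concrete, here are two common ways to compare a float tensor with its quantized counterpart. These metric choices are illustrative; they are not necessarily the exact set Xquant reports:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two tensors."""
    return float(np.mean((a - b) ** 2))

def cosine_similarity(a, b):
    """Cosine similarity between two flattened tensors (1.0 = identical direction)."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical intermediate outputs of a float model and its quantized version.
float_out = np.array([0.5, -1.0, 2.0, 0.25])
quant_out = np.array([0.5, -1.0, 2.0, 0.0])  # small value rounded to zero
```

Comparing such per-layer outputs side by side is what lets the report localize where quantization hurts the most.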
  • Introduced TPC IMX500.v3 (experimental):

    • Support constants quantization: constants of the Add, Sub, Mul & Div operators are quantized to 8 bits using per-axis power-of-two quantization; the axis is chosen per constant according to the minimum quantization error.
    • IMX500 TPC now supports 16-bit activation quantization for the following operators: Add, Sub, Mul, Concat & Stack.
    • Support assigning the allowed input precision options to each operator, that is, the bit-width used to represent the operator's input activation tensor.
    • Default TPC remains IMX500.v1.
    • For selecting IMX500.v3 in Keras:
      • tpc_v3 = mct.get_target_platform_capabilities("tensorflow", 'imx500', target_platform_version="v3")
      • mct.ptq.keras_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v3)
    • For selecting IMX500.v3 in PyTorch:
      • tpc_v3 = mct.get_target_platform_capabilities("pytorch", 'imx500', target_platform_version="v3")
      • mct.ptq.pytorch_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v3)
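The per-axis power-of-two constant quantization described above can be sketched as follows. This is a simplified NumPy illustration of the technique, not MCT's implementation, and the helper names are hypothetical:

```python
import numpy as np

def pot_quantize(x, axis, n_bits=8):
    """Symmetric power-of-two quantization of x, per-slice along `axis`."""
    reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
    max_abs = np.max(np.abs(x), axis=reduce_axes, keepdims=True)
    # Round each slice's threshold up to the nearest power of two.
    threshold = 2.0 ** np.ceil(np.log2(np.maximum(max_abs, 1e-12)))
    scale = threshold / 2 ** (n_bits - 1)
    q = np.clip(np.round(x / scale), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)
    return q * scale

def quantize_constant(x, n_bits=8):
    """Try each axis and keep the one with minimum quantization error (MSE)."""
    candidates = [(np.mean((x - pot_quantize(x, ax, n_bits)) ** 2), ax)
                  for ax in range(x.ndim)]
    _, axis = min(candidates)
    return pot_quantize(x, axis, n_bits), axis

# Rows with very different magnitudes: per-row (axis 0) scales fit much better.
const = np.array([[0.1, 0.2],
                  [10.0, 20.0]])
q, axis = quantize_constant(const)
```

Because the two rows differ in magnitude by two orders, a per-row scale keeps both rows accurate, so axis 0 wins the error comparison.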
  • Introduced BitWidthConfig API.

  • Tutorials:

Breaking changes

  • To configure OpQuantizationConfig in the TPC, additional arguments have been added:
    • Signedness specifies whether the quantization method is signed or unsigned.
    • supported_input_activation_n_bits sets the number of bits that the operator accepts as input.

Bug fixes:

  • Fixed a bug in PyTorch model reader of reshape operator #1086.
  • Fixed a bug in GPTQ with bias learning for cases where a convolutional layer has None as its bias #1109.
  • Fixed an issue in mixed precision where running only weights (or only activation) compression could fail or produce incorrect results when layers had multiple candidates for the other dimension (activation or weights, respectively). A new filtering procedure now runs before the mixed-precision search to remove the unnecessary candidates #1162.
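The idea behind the fix can be sketched in plain Python. This is an illustrative simplification, not MCT's filtering code; the selection rule (keep the max-activation candidate per weight bit-width) is an assumption for the example:

```python
def filter_candidates(candidates, search_weights_only):
    """Sketch of mixed-precision candidate pre-filtering (not MCT's code).

    candidates: list of (weights_n_bits, activation_n_bits) pairs for a layer.
    When only weights compression is searched, keep a single candidate per
    weight bit-width so the search space is one-dimensional again.
    """
    if not search_weights_only:
        return candidates
    best = {}
    for w_bits, a_bits in candidates:
        # Keep the highest-precision activation option for each weight width.
        if w_bits not in best or a_bits > best[w_bits]:
            best[w_bits] = a_bits
    return sorted(best.items())

# A layer with candidates along both dimensions; only weights are searched.
layer_candidates = [(8, 8), (8, 4), (4, 8), (4, 4), (2, 8)]
filtered = filter_candidates(layer_candidates, search_weights_only=True)
```

After filtering, the search sees exactly one candidate per weight bit-width, which removes the ambiguity that made the search fail or pick incorrectly.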

New Contributors

Welcome @DaniAffCH, @irenaby, and @yardeny-sony on their first contributions! PR #1094, PR #1118, PR #1163

Full Changelog: v2.1.0...v2.2.0