
8.0b2 Release #2308

Merged
merged 1 commit into main from 8.0b2-release
Aug 16, 2024

Conversation

@jakesabathia2 (Collaborator) commented Aug 13, 2024

CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/1413983249

Release Notes

  • Support for Latest Dependencies
    • Compatible with the latest protobuf Python package, which improves serialization latency.
    • Compatible with numpy 2.0.
    • Supports scikit-learn 1.5.
  • New Core ML model utils (see the model-splitting sketch after this list)
    • coremltools.models.utils.bisect_model can break a large Core ML model into two smaller models of similar size.
    • coremltools.models.utils.materialize_dynamic_shape_mlmodel can convert a flexible-input-shape model into a static-input-shape model.
  • New compression features in coremltools.optimize.coreml (a palettization sketch follows this list)
    • Vector palettization: setting cluster_dim > 1 in coremltools.optimize.coreml.OpPalettizerConfig enables vector palettization, where each entry in the look-up table is a vector of length cluster_dim.
    • Palettization with per-channel scale: setting enable_per_channel_scale=True in coremltools.optimize.coreml.OpPalettizerConfig normalizes weights along the output channel using per-channel scales before they are palettized.
    • Joint compression: a new pattern is supported, where weights are first quantized to int8 and then palettized into an n-bit look-up table with int8 entries.
    • Support for converting palettized models with 8-bit LUTs produced by coremltools.optimize.torch.
  • New compression features / bug fixes in coremltools.optimize.torch
    • Added conversion support for Torch models jointly compressed using the training-time APIs in coremltools.optimize.torch.
    • Added vector palettization support to SKMPalettizer.
    • Fixed a bug in the construction of weight vectors along the output channel for vector palettization with PostTrainingPalettizer and DKMPalettizer.
    • Deprecated the cluster_dtype option in favor of lut_dtype in ModuleDKMPalettizerConfig.
    • Added support for quantizing ConvTranspose modules with PostTrainingQuantizer and LinearQuantizer.
    • Added static grouping for the activation heuristic in GPTQ.
    • Fixed a bug in how quantization scales are computed for Conv2D layers with per-block quantization in GPTQ.
    • Activation-only quantization can now be performed with the QAT APIs.
  • Experimental torch.export conversion support (a conversion sketch follows this list)
    • Support conversion of stateful models with mutable buffers.
    • Support conversion of models with dynamic input shapes.
    • Support conversion of models with 4-bit weight compression.
  • Support for new torch ops: clip.
  • Various other bug fixes, enhancements, cleanups, and optimizations.
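
A minimal sketch of the new model-splitting utility, assuming a large saved .mlpackage on disk; the paths are placeholders and the exact keyword arguments may differ from what is shown here.

```python
import coremltools as ct

# Split a large Core ML model into two smaller chunks of roughly equal size.
# "large_model.mlpackage" and "chunked_models/" are hypothetical paths.
ct.models.utils.bisect_model(
    "large_model.mlpackage",
    output_dir="chunked_models/",
)
```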
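A configuration sketch for the new palettization options, using the option names from the notes above (cluster_dim, enable_per_channel_scale); mode="kmeans", nbits=4, and the model path are illustrative choices.

```python
import coremltools as ct
import coremltools.optimize as cto

# 4-bit palettization where each look-up-table entry is a vector of length 4,
# with weights normalized by per-channel scales before palettization.
op_config = cto.coreml.OpPalettizerConfig(
    mode="kmeans",
    nbits=4,
    cluster_dim=4,
    enable_per_channel_scale=True,
)
config = cto.coreml.OptimizationConfig(global_config=op_config)

mlmodel = ct.models.MLModel("model.mlpackage")  # hypothetical path
compressed_mlmodel = cto.coreml.palettize_weights(mlmodel, config)
```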
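A sketch of the experimental torch.export conversion path, exercising the newly supported clip op; SmallNet is a made-up module for illustration.

```python
import torch
import coremltools as ct

class SmallNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 8)

    def forward(self, x):
        # torch.clip is among the newly supported ops.
        return torch.clip(self.linear(x), min=0.0, max=1.0)

# Export with torch.export, then convert the ExportedProgram directly.
example_input = (torch.rand(1, 16),)
exported_program = torch.export.export(SmallNet().eval(), example_input)
mlmodel = ct.convert(exported_program)
```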

@jakesabathia2 requested a review from junpeiz on August 15, 2024 at 04:18
@jakesabathia2 changed the title from [WIP] 8.0b2 Release to 8.0b2 Release on Aug 15, 2024
@YifanShenSZ (Collaborator) left a comment

Let's goooooo

@jakesabathia2 merged commit 5e2460f into main on Aug 16, 2024
@jakesabathia2 deleted the 8.0b2-release branch on August 16, 2024 at 00:36