
Releases: huggingface/optimum-quanto

release: 0.2.6

29 Oct 17:13

What's Changed

New Contributors

Full Changelog: v0.2.5...v0.2.6

release: 0.2.5

01 Oct 11:52

New features

Bug fixes

Full Changelog: v0.2.4...v0.2.5

release: 0.2.4

26 Jul 09:25

Bug fixes

  • fix import error in optimum-cli when diffusers is not installed by @dacorvo

Full Changelog: v0.2.3...v0.2.4

release: 0.2.3

25 Jul 15:18

What's Changed

Bug fixes

New Contributors

Full Changelog: v0.2.2...v0.2.3

release: 0.2.2

28 Jun 15:30

New features

  • add OWLv2 detection example by @dacorvo,
  • use new torch quantization kernels by @dacorvo.

Bug fixes

  • avoid CUDA compilation errors on older Nvidia cards (pre-Ampere) by @dacorvo,
  • recompile extensions when PyTorch is updated, preventing a segfault by @dacorvo.

release: 0.2.1

31 May 15:24

This release does not contain any new features, but it is the first one published under the new package name, optimum-quanto.

release: 0.2.0

24 May 10:49

New features

release: v0.1.0

13 Mar 08:43

New features

  • group-wise quantization (see the sketch after this list),
  • safe serialization.
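
Group-wise quantization computes a separate scale for each small group of consecutive weight values along an axis, instead of a single scale per tensor or per channel, which limits the impact of outliers on quantization error. The snippet below is a minimal, self-contained PyTorch sketch of the idea only, not quanto's implementation; the group size of 128 and the helper names are arbitrary choices for illustration.

```python
import torch

def groupwise_int8_quantize(w: torch.Tensor, group_size: int = 128):
    """Quantize a 2D weight tensor to int8 with one scale per group of columns."""
    out_features, in_features = w.shape
    assert in_features % group_size == 0
    groups = w.reshape(out_features, in_features // group_size, group_size)
    # One symmetric scale per (row, group), derived from the group's maximum magnitude.
    scales = groups.abs().amax(dim=-1, keepdim=True) / 127.0
    q = torch.clamp(torch.round(groups / scales), -127, 127).to(torch.int8)
    return q, scales

def groupwise_dequantize(q: torch.Tensor, scales: torch.Tensor, shape) -> torch.Tensor:
    return (q.float() * scales).reshape(shape)

w = torch.randn(64, 256)
q, scales = groupwise_int8_quantize(w)
w_hat = groupwise_dequantize(q, scales, w.shape)
print((w - w_hat).abs().max())  # small reconstruction error
```

Smaller groups reduce the reconstruction error at the cost of storing more scales.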

release: 0.0.13

23 Feb 17:11

New features

  • new QConv2d quantized module,
  • official support for float8 weights (illustrated below).
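
Float8 weight support plugs into the same high-level workflow as the integer weight types. Below is a hedged sketch using the current optimum.quanto import path (at the time of this release the package was still imported as quanto, and float8 storage requires a PyTorch version with float8 dtypes); Conv2d layers are converted to quantized modules such as the new QConv2d during quantization.

```python
import torch
from torch import nn
from optimum.quanto import quantize, freeze, qfloat8

# A small CNN whose Conv2d layers are converted to quantized modules.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

quantize(model, weights=qfloat8)  # mark modules for float8 weight quantization
freeze(model)                     # replace the float weights with quantized ones

print(model(torch.randn(1, 3, 32, 32)).shape)
```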

Bug fixes

  • fix QBitsTensor.to() so that it also moves the inner tensors,
  • prevent shallow QTensor copies when loading weights, since these did not move the inner tensors.

release: 0.0.12

16 Feb 17:04

New features

  • quanto kernels library (not yet used by quantize).

Breaking changes

  • quantization types are now all quanto.dtype instances (see the example below).
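
In practice this means callers now pass quanto dtype objects such as qint8 or qint4 wherever a quantization type is expected, rather than a torch dtype. A minimal sketch, assuming the quantize/freeze entry points of this era accepted the same weights/activations keywords as today's optimum.quanto:

```python
from torch import nn
from quanto import quantize, freeze, qint4, qint8  # quanto dtypes, not torch dtypes

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8))

# Quantization targets are expressed with quanto dtypes such as qint4 / qint8.
quantize(model, weights=qint4, activations=qint8)
freeze(model)
```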