Releases · huggingface/optimum-quanto
release: 0.2.6
What's Changed
- Add HIP support by @Disty0 in #330
- Switched linters, black -> ruff by @ishandeva in #334
- Add marlin int4 kernel by @dacorvo and @shcho1118 in #333
- fix: use reshape instead of view by @dacorvo in #338
- Support QLayerNorm without weights by @dacorvo in #341
New Contributors
- @ishandeva made their first contribution in #334
- @Disty0 made their first contribution in #330
- @shcho1118 made their first contribution in #333
Full Changelog: v0.2.5...v0.2.6
release: 0.2.5
New features
- Load and save models from the Hugging Face hub #263 by @sayakpaul (see the sketch after this list)
- Add support for float8 e4m3fnuz #310 (from #281) by @maktukmak
- Faster and less memory-intensive requantization #290 by @latentCall145
- Support torch.equal for QTensor #294 by @dacorvo
- Add Marlin Float8 kernel #296 (from #241) by @fxmarty
- Add Whisper speech recognition example #298 (from #242) by @mattiadg
- Add ViT classification example #308 by @shovan777
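A minimal sketch of the hub round-trip from #263, assuming the `QuantizedModelForCausalLM` API introduced in 0.2.3; the model id, save path, and hub repo id are placeholders, and hub push support is assumed:

```python
from transformers import AutoModelForCausalLM
from optimum.quanto import QuantizedModelForCausalLM, qint4

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
qmodel = QuantizedModelForCausalLM.quantize(model, weights=qint4)

# Save locally, then reload; from_pretrained also accepts a hub repo id.
qmodel.save_pretrained("./opt-125m-qint4")
reloaded = QuantizedModelForCausalLM.from_pretrained("./opt-125m-qint4")

# Pushing to the hub (repo id is hypothetical).
qmodel.push_to_hub("my-user/opt-125m-qint4")
```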
Bug fixes
- Fix include patterns in quantize #271 by @kaibioinfo
- Enable non-strict loading of state dicts #295 by @BenjaminBossan
- Fix transformers forward error #303 by @dacorvo
- Fix missing call in transformers models #325 by @dacorvo
- Fix 8-bit mm calls for 4D inputs #326 by @dacorvo
Full Changelog: v0.2.4...v0.2.5
release: 0.2.4
Bug fixes
- Fix import error in `optimum-cli` when diffusers is not installed by @dacorvo
Full Changelog: v0.2.3...v0.2.4
release: 0.2.3
What's Changed
- Use new int8 torch kernels by @dacorvo in #222
- Rebuild extension when pytorch is updated by @dacorvo in #223
- Use tinygemm bfloat16 / int4 kernel whenever possible by @dacorvo in #234
- Add HQQ optimizer by @dacorvo in #235
- Add QuantizedModelForCausalLM by @dacorvo in #243 (see the sketch after this list)
- Integrate quanto commands to optimum-cli by @dacorvo in #244
- Add pixart-sigma test to image example by @dacorvo in #247
- Support diffusion models by @sayakpaul in #255
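A minimal sketch of the new high-level API from #243, which quantizes a transformers causal LM in one call; the model id is a placeholder, and bfloat16 is chosen because the tinygemm kernel from #234 targets bfloat16 activations with int4 weights (actual kernel dispatch depends on the hardware):

```python
import torch
from transformers import AutoModelForCausalLM
from optimum.quanto import QuantizedModelForCausalLM, qint4

# Load in bfloat16 so int4 weights can hit the tinygemm kernel (#234).
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", torch_dtype=torch.bfloat16
)
# Quantize weights to int4; activations stay in bfloat16.
qmodel = QuantizedModelForCausalLM.quantize(model, weights=qint4)
```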
Bug fixes
- Fix: align extension on max arch by @dacorvo in #227
- Fix TinyGemmQBitsTensor move by @dacorvo in #246
- Fix streamlining bug by @dacorvo in #249
- Fix float/int8 matrix multiplication latency regression by @dacorvo in #250
- Fix serialization issues by @dacorvo in #258
New Contributors
- @sayakpaul made their first contribution in #255
Full Changelog: v0.2.2...v0.2.3
release: 0.2.2
release: 0.2.1
This release does not contain any new features, but it is the first one published under the new package name.
release: 0.2.0
New features
- requantize helper by @calmitchell617 (see the sketch after this list)
- StableDiffusion example by @thliang01
- improved linear backward path by @dacorvo
- AWQ int4 kernels by @dacorvo
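A minimal sketch of the requantize helper, following the serialization pattern from the project README; file names are placeholders, and the current `optimum.quanto` import path and API shape are shown (at the 0.2.0 release the package was still named `quanto`):

```python
import json
import torch
from safetensors.torch import load_file, save_file
from optimum.quanto import freeze, qint8, quantization_map, quantize, requantize

def make_model():
    return torch.nn.Sequential(torch.nn.Linear(16, 16))

model = make_model()
quantize(model, weights=qint8)
freeze(model)

# Persist the quantized state dict plus the quantization map.
save_file(model.state_dict(), "model.safetensors")
with open("qmap.json", "w") as f:
    json.dump(quantization_map(model), f)

# Later: rebuild the float model and requantize it in one step,
# instead of re-running quantize/freeze on full-precision weights.
new_model = make_model()
state_dict = load_file("model.safetensors")
with open("qmap.json") as f:
    qmap = json.load(f)
requantize(new_model, state_dict, qmap, device=torch.device("cpu"))
```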
release: 0.1.0
New features
- group-wise quantization
- safe serialization
release: 0.0.13
New features
- new `QConv2d` quantized module
- official support for `float8` weights (see the sketch below)
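A minimal sketch combining the two features, using the current `optimum.quanto` import path (the package was still named `quanto` at this release); quantize() swaps eligible Conv2d layers for the new `QConv2d` module:

```python
import torch
from optimum.quanto import freeze, qfloat8, quantize

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, kernel_size=3))
# Conv2d layers are replaced by QConv2d; weights are quantized to float8.
quantize(model, weights=qfloat8)
freeze(model)
```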
Bug fixes
- fix `QBitsTensor.to()` that was not moving the inner tensors
- prevent shallow `QTensor` copies that do not move inner tensors when loading weights
release: 0.0.12
New features
- quanto kernels library (not yet used by quantize)
Breaking changes
- quantization types are now all `quanto.dtype` objects (as sketched below)
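A minimal sketch of the dtype-based API after this change, shown with the current `optimum.quanto` import path and quantize signature (the package was named `quanto` at this release, so the exact call shape then is an assumption):

```python
import torch
from optimum.quanto import qint8, quantize

model = torch.nn.Sequential(torch.nn.Linear(8, 8))
# Quantization targets are dtype objects such as qint8, not strings.
quantize(model, weights=qint8)
```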