Refactor quantization tensor data representation #2479

laggui · 2024-11-11T20:25:40Z

Checklist

Confirmed that the run-checks all script has been executed.

Changes

DType::QFloat no longer contains the QuantizationStrategy with its quantization parameters, since the parameters are not known at compile-time. Instead, it uses the existing QuantizationScheme enum, which only indicates the method and dtype.
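
To make the distinction concrete, here is a minimal sketch of a dtype tag that carries only the scheme while the parameters live in a separate runtime value. All names below (SchemeSketch, DTypeSketch, QParamsSketch, needs_offset) are hypothetical simplifications for illustration and are not the actual definitions in burn-tensor.

```rust
// Hypothetical, simplified types for illustration only; not Burn's real definitions.

/// Describes how values are quantized. This is static information.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SchemeSketch {
    /// Per-tensor affine quantization (scale and offset) to 8-bit integers.
    PerTensorAffineInt8,
    /// Per-tensor symmetric quantization (scale only) to 8-bit integers.
    PerTensorSymmetricInt8,
}

/// Data type tag: it carries only the scheme, never the runtime parameters.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DTypeSketch {
    F32,
    QFloat(SchemeSketch),
}

/// Runtime quantization parameters, computed from the data itself.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct QParamsSketch {
    scale: f32,
    offset: Option<i8>,
}

/// The scheme alone is enough to know which parameters must be present.
fn needs_offset(dtype: DTypeSketch) -> bool {
    matches!(dtype, DTypeSketch::QFloat(SchemeSketch::PerTensorAffineInt8))
}
```

Because the tag holds no floating-point parameters, two tensors quantized with the same scheme but different scales share the same dtype in this sketch.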

Also changed the representation to pack the quantized values (as was previously done exclusively in burn-jit) and to include the quantization parameters in the bytes.
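
As a rough illustration of the idea, the sketch below packs i8 values four per u32 word and appends the qparams after the packed words. The byte layout (first value in the most significant byte, little-endian words, an i32 offset followed by an f32 scale) and the function names are assumptions made for this example, not necessarily the layout used in this PR.

```rust
/// Pack i8 values four per u32 word, first value in the most significant byte.
/// The last word is zero-padded when the length is not a multiple of 4.
fn pack_i8s(values: &[i8]) -> Vec<u32> {
    values
        .chunks(4)
        .map(|chunk| {
            let mut bytes = [0u8; 4];
            for (slot, v) in bytes.iter_mut().zip(chunk) {
                *slot = *v as u8; // reinterpret the bits, no value change
            }
            u32::from_be_bytes(bytes)
        })
        .collect()
}

/// Inverse of `pack_i8s`; `len` is the number of original values (drops padding).
fn unpack_i8s(words: &[u32], len: usize) -> Vec<i8> {
    words
        .iter()
        .flat_map(|w| w.to_be_bytes())
        .take(len)
        .map(|b| b as i8)
        .collect()
}

/// Serialize packed values followed by the qparams (assumed order: offset, scale).
fn to_bytes(values: &[i8], scale: f32, offset: i32) -> Vec<u8> {
    let mut bytes: Vec<u8> = pack_i8s(values)
        .iter()
        .flat_map(|w| w.to_le_bytes())
        .collect();
    bytes.extend_from_slice(&offset.to_le_bytes());
    bytes.extend_from_slice(&scale.to_le_bytes());
    bytes
}

/// Split the qparams off the tail, then unpack the remaining words.
fn from_bytes(bytes: &[u8], len: usize) -> Option<(Vec<i8>, f32, i32)> {
    const QPARAMS_SIZE: usize = 8; // 4 bytes offset + 4 bytes scale
    if bytes.len() < QPARAMS_SIZE {
        return None;
    }
    let (packed, q) = bytes.split_at(bytes.len() - QPARAMS_SIZE);
    if packed.len() % 4 != 0 || packed.len() < len {
        return None;
    }
    let words: Vec<u32> = packed
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    let offset = i32::from_le_bytes([q[0], q[1], q[2], q[3]]);
    let scale = f32::from_le_bytes([q[4], q[5], q[6], q[7]]);
    Some((unpack_i8s(&words, len), scale, offset))
}
```

Since the qparams sit at a fixed position relative to the end of the buffer, a reader that knows the scheme and the number of elements can recover both the values and the parameters from the bytes alone.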

Testing

Added unit tests for packed/unpacked quantization parameters.

codecov · 2024-11-11T20:54:02Z

Codecov Report

Attention: Patch coverage is 86.53295% with 47 lines in your changes missing coverage. Please review.

Project coverage is 82.95%. Comparing base (6e71aaf) to head (3d7ca13).
Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
crates/burn-tch/src/ops/qtensor.rs	0.00%	32 Missing ⚠️
crates/burn-tensor/src/tensor/data.rs	94.11%	11 Missing ⚠️
...ates/burn-tensor/src/tensor/quantization/scheme.rs	0.00%	2 Missing ⚠️
crates/burn-ndarray/src/ops/qtensor.rs	98.03%	1 Missing ⚠️
crates/burn-tensor/src/tensor/ops/qtensor.rs	92.30%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2479      +/-   ##
==========================================
+ Coverage   82.91%   82.95%   +0.03%     
==========================================
  Files         810      811       +1     
  Lines      104904   105057     +153     
==========================================
+ Hits        86984    87151     +167     
+ Misses      17920    17906      -14


nathanielsimard · 2024-11-12T22:33:09Z

crates/burn-jit/src/ops/qtensor.rs

                        qparams: JitQuantizationParameters::new(
-                            q.scale.elem(),
-                            Some(q.offset.elem()),
+                            qparams.scale,
+                            qparams.offset,
                            device,


Do we still need to have the scale and the offset in the representation?

laggui · 2024-11-13

Good point! With these changes we could go a step further and have the QJitTensor keep the packed values with the qparams appended 🤔 We would need a new CubeType to achieve the same thing.

The kernel could probably use a refactor in a follow-up PR; there have been a lot of improvements in cubecl since 😅

laggui added 5 commits November 11, 2024 15:17

Remove quantization strategy from QFloat to use scheme instead (qparams unknown at compile-time)

2a0b6a3

Instead, the qparams are stored in the TensorData bytes so we can pack/unpack them based on the scheme

Change quantization tensor data representation to pack quantized data type into u32

4ea1f20

Fix clippy

3fa8773

Remove comment

1008323

Add alloc vec import

0a873dc

Remove print

a88227f

laggui changed the title from "Remove quantization tensor data representation" to "Refactor quantization tensor data representation" Nov 12, 2024

Rename into_bytes

3d7ca13

laggui marked this pull request as ready for review November 12, 2024 18:20

laggui requested a review from nathanielsimard November 12, 2024 18:20

nathanielsimard approved these changes Nov 12, 2024

laggui merged commit 34b303e into main Nov 13, 2024
11 checks passed

laggui deleted the refactor/quant-data branch November 13, 2024 15:46

laggui mentioned this pull request Dec 9, 2024

Refactor jit quantized tensor representation #2604

Merged

1 task
