
Refactor quantization tensor data representation #2479

Merged: 7 commits merged into main on Nov 13, 2024
Conversation

laggui (Member) commented Nov 11, 2024

Checklist

  • Confirmed that the `run-checks all` script has been executed.

Changes

DType::QFloat no longer contains the QuantizationStrategy with the quantization parameters, since these are not known at compile-time. Instead, it uses the existing QuantizationScheme enum, which only indicates the quantization method and dtype.

Also changed the representation to pack the quantized values (as was previously done exclusively in burn-jit) and to append the quantization parameters to the tensor data bytes.
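The packing described above can be sketched as follows. This is an illustrative stand-alone example, not the actual Burn implementation: names like `pack_i8s` and `with_qparams` are hypothetical, and it assumes a per-tensor affine scheme where four signed 8-bit values are packed per u32 word and an f32 scale plus i32 offset are appended to the byte buffer.

```rust
// Hypothetical sketch of the packed representation (not Burn's real API):
// four i8 quantized values per u32 word, qparams appended as trailing bytes.

fn pack_i8s(values: &[i8]) -> Vec<u32> {
    values
        .chunks(4)
        .map(|chunk| {
            let mut word = 0u32;
            for (i, &v) in chunk.iter().enumerate() {
                // Reinterpret the i8 as u8 so the sign bits don't smear.
                word |= ((v as u8) as u32) << (i * 8);
            }
            word
        })
        .collect()
}

fn unpack_i8s(words: &[u32], len: usize) -> Vec<i8> {
    words
        .iter()
        .flat_map(|&w| (0..4).map(move |i| ((w >> (i * 8)) & 0xFF) as u8 as i8))
        .take(len) // drop padding from the last partial word
        .collect()
}

/// Append the qparams after the packed values, mirroring the idea of
/// storing everything in a single TensorData byte buffer.
fn with_qparams(packed: &[u32], scale: f32, offset: i32) -> Vec<u8> {
    let mut bytes: Vec<u8> = packed.iter().flat_map(|w| w.to_le_bytes()).collect();
    bytes.extend_from_slice(&scale.to_le_bytes());
    bytes.extend_from_slice(&offset.to_le_bytes());
    bytes
}

fn main() {
    let q = [-128i8, -1, 0, 127, 5];
    let packed = pack_i8s(&q);
    assert_eq!(unpack_i8s(&packed, q.len()), q);

    // The last 8 bytes hold the f32 scale and i32 offset.
    let bytes = with_qparams(&packed, 0.1, -3);
    let n = bytes.len();
    let scale = f32::from_le_bytes(bytes[n - 8..n - 4].try_into().unwrap());
    let offset = i32::from_le_bytes(bytes[n - 4..].try_into().unwrap());
    assert_eq!((scale, offset), (0.1f32, -3));
    println!("round-trip ok");
}
```

Since the byte layout is fixed by the scheme, a consumer can always recover the qparams from the tail of the buffer without any compile-time strategy.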

Testing

Added unit tests for packed/unpacked quantization parameters.

…ms unknown at compile-time)

Instead, the qparams are stored in the TensorData bytes so we can pack/unpack them based on the scheme.
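The scheme-dependent unpacking mentioned here can be illustrated with a minimal sketch. The enum shape approximates Burn's QuantizationScheme/QuantizationType types but is written from scratch here; the helper `qparams_num_bytes` is hypothetical and assumes an affine scheme stores an f32 scale plus an i32 offset while a symmetric scheme stores only the scale.

```rust
// Illustrative only: the scheme is compile-time metadata (method + dtype),
// and it alone determines how many trailing qparams bytes to peel off.

#[derive(Clone, Copy, Debug, PartialEq)]
enum QuantizationType {
    QInt8,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum QuantizationScheme {
    PerTensorAffine(QuantizationType),
    PerTensorSymmetric(QuantizationType),
}

/// Hypothetical helper: how many bytes of qparams trail the packed values.
fn qparams_num_bytes(scheme: &QuantizationScheme) -> usize {
    match scheme {
        // affine: f32 scale + i32 offset
        QuantizationScheme::PerTensorAffine(_) => 8,
        // symmetric: f32 scale only
        QuantizationScheme::PerTensorSymmetric(_) => 4,
    }
}

fn main() {
    let affine = QuantizationScheme::PerTensorAffine(QuantizationType::QInt8);
    let symmetric = QuantizationScheme::PerTensorSymmetric(QuantizationType::QInt8);
    assert_eq!(qparams_num_bytes(&affine), 8);
    assert_eq!(qparams_num_bytes(&symmetric), 4);
    println!("scheme-based qparams sizes ok");
}
```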

codecov bot commented Nov 11, 2024

Codecov Report

Attention: Patch coverage is 86.53295% with 47 lines in your changes missing coverage. Please review.

Project coverage is 82.95%. Comparing base (6e71aaf) to head (3d7ca13).
Report is 3 commits behind head on main.

Files with missing lines                              | Patch %  | Lines
crates/burn-tch/src/ops/qtensor.rs                    | 0.00%    | 32 Missing ⚠️
crates/burn-tensor/src/tensor/data.rs                 | 94.11%   | 11 Missing ⚠️
...ates/burn-tensor/src/tensor/quantization/scheme.rs | 0.00%    | 2 Missing ⚠️
crates/burn-ndarray/src/ops/qtensor.rs                | 98.03%   | 1 Missing ⚠️
crates/burn-tensor/src/tensor/ops/qtensor.rs          | 92.30%   | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2479      +/-   ##
==========================================
+ Coverage   82.91%   82.95%   +0.03%     
==========================================
  Files         810      811       +1     
  Lines      104904   105057     +153     
==========================================
+ Hits        86984    87151     +167     
+ Misses      17920    17906      -14     


@laggui laggui changed the title Remove quantization tensor data representation Refactor quantization tensor data representation Nov 12, 2024
@laggui laggui marked this pull request as ready for review November 12, 2024 18:20
Comment on lines 46 to 49
qparams: JitQuantizationParameters::new(
q.scale.elem(),
Some(q.offset.elem()),
qparams.scale,
qparams.offset,
device,
Member
Do we still need to have the scale and the offset in the representation?

laggui (Member, Author) commented Nov 13, 2024

Good point! With these changes we could go a step further and have the QJitTensor keep the packed values with the appended qparams 🤔 we would need a new CubeType to achieve the same thing.

The kernel could probably use a refactor in a follow-up PR, there have been a lot of improvements in cubecl since 😅

@laggui laggui merged commit 34b303e into main Nov 13, 2024
11 checks passed
@laggui laggui deleted the refactor/quant-data branch November 13, 2024 15:46