
Add binary remainder for tensor #2427

Merged: 25 commits into tracel-ai:main on Nov 11, 2024

Conversation

@med1844 (Contributor) commented Oct 27, 2024

Pull Request Template

Checklist

  • Confirmed that the run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.
    • Note: The burn book already documents remainder without specifying that only remainder_scalar is implemented. This PR implements the expected behavior, so no book changes are needed (see the sketch after this list).
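
For context, a minimal sketch of the two call forms, assuming the ndarray backend and the PyTorch-style semantics described under Changes below (the tensor values are illustrative, not taken from the PR):

use burn::backend::NdArray;
use burn::tensor::Tensor;

fn main() {
    let device = Default::default();
    let lhs = Tensor::<NdArray, 1>::from_floats([2.0, -2.0], &device);
    let rhs = Tensor::<NdArray, 1>::from_floats([-1.5, 1.5], &device);

    // Existing op, already documented in the book: tensor % scalar.
    println!("{}", lhs.clone().remainder_scalar(-1.5)); // expected [-1.0, -0.5]

    // Op added by this PR: elementwise tensor % tensor.
    println!("{}", lhs.remainder(rhs)); // expected [-1.0, 1.0]
}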

Related Issues/PRs

Closes #1510

Changes

Added binary remainder to existing backends, including tests & autodiff.

The operation follows PyTorch's behavior where the remainder has the same sign as the divisor (e.g., 2.0 % -1.5 = -1.0), rather than Rust's default behavior (2.0 % -1.5 = 0.5).
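
Concretely, the floored remainder can be expressed in a few lines of plain Rust (a minimal sketch of the semantics only, not the PR's actual backend implementations):

fn floored_remainder(a: f64, b: f64) -> f64 {
    // Floored-division remainder, as used by PyTorch's remainder.
    a - b * (a / b).floor()
}

fn main() {
    assert_eq!(floored_remainder(2.0, -1.5), -1.0); // sign follows the divisor
    assert_eq!(2.0_f64 % -1.5, 0.5); // Rust's %: sign follows the dividend
}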

Testing

Added tests for burn-candle, burn-ndarray, burn-autodiff, and burn-tensor.

Questions

(The questions are quoted and answered in the reviewer's comment below.)


codecov bot commented Oct 27, 2024

Codecov Report

Attention: Patch coverage is 82.06785% with 111 lines in your changes missing coverage. Please review.

Project coverage is 82.79%. Comparing base (69de0ef) to head (300d392).
Report is 3 commits behind head on main.

Files with missing lines                        Patch %   Missing lines
crates/burn-fusion/src/ops/int.rs                 0.00%   26 ⚠️
crates/burn-router/src/ops/op_int.rs              0.00%   18 ⚠️
crates/burn-candle/src/ops/tensor.rs              0.00%   14 ⚠️
crates/burn-candle/src/ops/int_tensor.rs          0.00%   12 ⚠️
crates/burn-tch/src/ops/int_tensor.rs             0.00%   12 ⚠️
crates/burn-tch/src/ops/base.rs                   0.00%    9 ⚠️
crates/burn-tensor/src/tensor/api/numeric.rs     80.00%    4 ⚠️
crates/burn-autodiff/src/ops/int_tensor.rs        0.00%    3 ⚠️
crates/burn-jit/src/ops/int_ops.rs                0.00%    3 ⚠️
crates/burn-ndarray/src/ops/int_tensor.rs         0.00%    3 ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2427      +/-   ##
==========================================
- Coverage   82.79%   82.79%   -0.01%     
==========================================
  Files         809      810       +1     
  Lines      104191   104804     +613     
==========================================
+ Hits        86270    86773     +503     
- Misses      17921    18031     +110     


@laggui (Member) left a comment

Thanks for adding the operation 🙂

Implementation looks good overall, just a few comments below.

To answer your questions...

How to display Instruction

I don't believe this is required for your PR. All of the compiler stuff mentioned in the linked PR has been moved to cubecl anyway.

Quantization tests
I tried to add tests for quantized tensors (also for the rounding functions, see #2372), but couldn't figure out how to generate quantized tensors that exactly match the existing tests. Please shed some light on this!

The values were manually calculated to represent the float tensors with the correct quantized values and parameters. In your tests, you can reuse the existing values for the inputs and, as with the other tests, check for equality on the dequantized values. That way the tests are easy to validate.

Take the exp() test for example:

#[test]
fn should_support_exp_ops() {
    // Quantized [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
    // NOTE: we use affine quantization to reduce quantization errors since `exp()` amplifies the error
    let data = TensorData::quantized(
        vec![-128i8, -77, -26, 25, 76, 127],
        [2, 3],
        QuantizationStrategy::PerTensorAffineInt8(AffineQuantization::init(0.019607844, -128)),
    );
    let tensor = TestTensor::<2>::from_data(data, &Default::default());

    let output = tensor.exp();
    let expected = TensorData::from([[1.0, 2.71830, 7.3891], [20.0855, 54.5981, 148.4132]]);

    // Precision 1 to approximate de/quantization errors
    output
        .dequantize()
        .into_data()
        .assert_approx_eq(&expected, 1);
}
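
As a quick sanity check on those quantized values (a standalone sketch using only the scale and zero point from the test above), dequantizing q as (q - zero_point) * scale recovers the original floats:

fn main() {
    let (scale, zero_point) = (0.019607844_f32, -128_i32);
    for q in [-128_i32, -77, -26, 25, 76, 127] {
        // Prints 0.000 1.000 2.000 3.000 4.000 5.000 (up to rounding error).
        print!("{:.3} ", (q - zero_point) as f32 * scale);
    }
    println!();
}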

Update on segfault when running run-checks.sh std
Turns out it's caused by bridge::byte::tests::should_support_dual_byte_bridge when trying to create a tensor on device2, which panics at cubecl-wgpu/src/runtime.rs:297:17: No adapter found for graphics API AutoGraphicsApi. Not sure how to solve this issue.

Ahhh this might just be due to your setup. For the multi-backend tests we use ndarray and wgpu as test backends, but wgpu might not be able to detect any device on your machine (possibly missing drivers). It could be fixed by installing Vulkan drivers, but otherwise you can ignore the failure locally and let it run on CI.

Resolved review threads: crates/burn-tensor/src/tensor/api/numeric.rs (two threads), crates/burn-tensor/src/tests/ops/remainder.rs
@med1844 (Contributor, Author) commented Oct 31, 2024

Thank you for the extra info on the quantization part! I have been able to produce the exact same output using Python:

import numpy as np
import numpy.typing as npt
from dataclasses import dataclass


@dataclass
class QuantizedTensor:
    quantized_tensor: npt.NDArray[np.int8]
    scale: float
    zero_point: int

    @classmethod
    def from_tensor(
        cls, in_tensor: npt.NDArray[np.float32] | list[float | int]
    ) -> "QuantizedTensor":
        dtype = np.int8
        tensor = np.array(in_tensor, dtype=np.float32)
        qmin, qmax = np.iinfo(dtype).min, np.iinfo(dtype).max
        vmin, vmax = tensor.min(), tensor.max()

        # Affine quantization: map [vmin, vmax] linearly onto [qmin, qmax].
        scale = (vmax - vmin) / (qmax - qmin)
        v_zero_point = -((vmax - vmin) / 2 + vmin)
        q_zero_point = (qmax - qmin) / 2 + qmin
        zero_point = round(v_zero_point / scale + q_zero_point)

        # If the zero point falls outside the i8 range, clamp it and stretch
        # the scale so the quantized values still fit into [qmin, qmax].
        if not qmin <= zero_point <= qmax:
            zero_point = min(qmax, max(qmin, zero_point))
            oob_tensor = np.round(tensor / scale + zero_point)
            omin, omax = min(oob_tensor.min(), qmin), max(oob_tensor.max(), qmax)
            scale /= (qmax - qmin) / (omax - omin)

        quantized_values = (
            np.round(tensor / scale + zero_point).clip(qmin, qmax).astype(dtype)
        )

        return cls(quantized_values, scale, int(zero_point))

    def to_tensor(self) -> npt.NDArray[np.float32]:
        return (self.quantized_tensor.astype(np.float32) - self.zero_point) * self.scale


if __name__ == "__main__":
    quant = QuantizedTensor.from_tensor(list(map(float, range(6))))
    print(quant)
    print(quant.to_tensor())

Which prints:

QuantizedTensor(quantized_tensor=array([-128,  -77,  -26,   25,   76,  127], dtype=int8), scale=0.0196078431372549, zero_point=-128.0)
[0. 1. 2. 3. 4. 5.]

I will use this to generate tests. It might also help other contributors who add new operators in the future.

Edit: turns out zero_point is an int. Modified the code to round zero_point.
Edit 2: turns out zero_point is an i8. Modified the code to recalculate scale.
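
For reference, the parameters generated above plug directly into the constructor pattern from the exp() example earlier in the thread; a hypothetical skeleton (the merged remainder test's exact inputs may differ):

// Quantized [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]], parameters from the script above.
let data = TensorData::quantized(
    vec![-128i8, -77, -26, 25, 76, 127],
    [2, 3],
    QuantizationStrategy::PerTensorAffineInt8(AffineQuantization::init(0.019607844, -128)),
);
let tensor = TestTensor::<2>::from_data(data, &Default::default());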

@laggui (Member) commented Nov 1, 2024

If you reuse the quantized input values already generated for an existing test, you can assume that the dequantized output values will be approximately equal to the floating-point results of the operation. So the reference can be kept as floats and you don't necessarily need to dig into quantization, but it is a good learning experience 😄

@med1844 (Contributor, Author) commented Nov 4, 2024

I have added quantized tests for the new operators (remainder, rounding functions). Maybe they should be in a separate PR, I'm not sure.

I have also looked at the previous CI checks. It seems that remainder is not working properly on macOS: it returns the input when it should return exactly 0. I don't own a Mac and thus have no idea how to debug this. Any help would be great!

@laggui (Member) left a comment

Just missing a few things, mostly some stuff in the added quantization tests.

Regarding the macOS CI failing... I'm not quite sure why, honestly 🤔 It's failing for the candle implementation of the op, but the implementation looks correct. A bit weird, though I also do not own a mac to check this out further.

Resolved review threads: crates/burn-ndarray/src/ops/base.rs, crates/burn-tensor/src/tests/quantization/ops/remainder.rs (three threads)
@med1844 (Contributor, Author) commented Nov 8, 2024

I have updated the code according to the comments. Please review again, thank you.
For the macOS CI failure, maybe we should find someone with Apple hardware to reproduce it and figure out what happens; or we could simply note in the book that the remainder operator may not work properly on macOS.

@laggui (Member) left a comment

LGTM! Thanks for addressing 🙏

Regarding the macOS test failure, it appears to always fail with candle, so perhaps we could simply disable the test for that backend specifically.

Once this is addressed it'll be good to merge!

Resolved review threads: crates/burn-candle/src/lib.rs (two threads)
@med1844 (Contributor, Author) commented Nov 9, 2024

I have commented out both tests and CI is finally happy. Please review, thank you!

@laggui (Member) left a comment

Thank you 🎉

@laggui merged commit 2d874ab into tracel-ai:main on Nov 11, 2024. 11 checks passed.