Add cachemask variant for fake_quantize_affine #500

Merged
merged 1 commit on Jul 17, 2024

Commits on Jul 16, 2024

  1. Add cachemask variant for fake_quantize_affine

    Summary: In QAT, we often wish to filter out the gradients
    corresponding to values outside the expected quantization
    range, for example:
    
    ```
    # Quantize, then dequantize (fake quantization)
    q = _quantize_affine_no_dtype_cast(...)
    dq = _dequantize_affine_no_dtype_check(...)
    # Mask of values that landed within the quantization range
    mask = torch.logical_and((q >= quant_min), (q <= quant_max))

    # Zero out gradients for values that were clamped
    grad = grad * mask
    ```
    
    The existing `fake_quantize_affine` returns only the
    dequantized values, so callers do not have access to this
    mask. This commit adds a variant of this op that returns
    both the dequantized values and the mask, similar to
    `fake_quantize_per_tensor_affine_cachemask` in core.
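
    A minimal sketch of how a caller might use the new variant
    inside a QAT autograd function, assuming it mirrors
    `fake_quantize_affine`'s signature with the mask appended to
    the return value (the import path and the `_FakeQuantize`
    wrapper below are illustrative, not part of this commit):

    ```
    import torch

    # Assumed to live next to fake_quantize_affine in torchao's
    # quant_primitives module.
    from torchao.quantization.quant_primitives import (
        fake_quantize_affine_cachemask,
    )

    class _FakeQuantize(torch.autograd.Function):
        """Straight-through estimator that zeroes gradients for
        values outside [quant_min, quant_max]."""

        @staticmethod
        def forward(ctx, input, scale, zero_point, quant_min, quant_max):
            # A block_size spanning the whole tensor means
            # per-tensor quantization
            fq, mask = fake_quantize_affine_cachemask(
                input, input.shape, scale, zero_point,
                torch.int8, quant_min, quant_max,
            )
            ctx.save_for_backward(mask)
            return fq

        @staticmethod
        def backward(ctx, grad_output):
            (mask,) = ctx.saved_tensors
            # Zero out gradients corresponding to clamped values
            return grad_output * mask, None, None, None, None
    ```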
    
    Test Plan:
    python test/quantization/test_quant_primitives.py -k test_fake_quantize_affine_cachemask
    andrewor14 committed Jul 16, 2024
    Commit d70f92c