Align _choose_qparams_affine with _choose_scale_float8 #3324

Description

@jerryzh168

In `_choose_qparams_affine` (for int), we use `keepdim=False`:

```python
min_val = torch.amin(input, dim=reduction_dims, keepdim=False)
max_val = torch.amax(input, dim=reduction_dims, keepdim=False)
```

which makes per-tensor quantization produce a scalar scale automatically.
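
To illustrate, a minimal sketch (with hypothetical shapes) of how `keepdim` changes the reduction result:

```python
import torch

input = torch.randn(4, 8)
reduction_dims = [0, 1]  # per-tensor: reduce over every dimension

# keepdim=False collapses all reduced dims, yielding a 0-d (scalar) tensor
min_val = torch.amin(input, dim=reduction_dims, keepdim=False)
print(min_val.shape)  # torch.Size([])

# keepdim=True keeps each reduced dim as size 1, so the result
# stays rank-aligned with the input
min_val_kd = torch.amin(input, dim=reduction_dims, keepdim=True)
print(min_val_kd.shape)  # torch.Size([1, 1])
```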

I think we can change `_choose_qparams_affine` to use `keepdim=True` and make sure the scale/zero_point dimensions match the input's, so we don't need to do this:

```python
# Reshape scale and zero_point to be compatible with block_size
# This is asserted in IntxUnpackedToInt8Tensor's __init__
n_blocks = []
for i in range(len(block_size)):
    assert qdata.shape[i] % block_size[i] == 0
    n_blocks.append(qdata.shape[i] // block_size[i])
scale = scale.reshape(*n_blocks)
zero_point = zero_point.reshape(*n_blocks)
```
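
A minimal sketch of the proposed `keepdim=True` flow, with hypothetical shapes; the `shape_for_reduction`/`reduction_dims` decomposition mirrors how the affine ops block the input internally, but the exact names here are assumptions:

```python
import torch

# Hypothetical shapes: input (4, 8), block_size (2, 4) -> n_blocks (2, 2)
input = torch.randn(4, 8)
block_size = (2, 4)
n_blocks = [input.shape[i] // block_size[i] for i in range(len(block_size))]

# View the input as interleaved (n_block, block) dims and reduce over the block dims
shape_for_reduction = [n_blocks[0], block_size[0], n_blocks[1], block_size[1]]
reduction_dims = [1, 3]
blocked = input.reshape(shape_for_reduction)

# keepdim=True keeps size-1 placeholders at the reduced dims...
min_val = torch.amin(blocked, dim=reduction_dims, keepdim=True)
print(min_val.shape)  # torch.Size([2, 1, 2, 1])

# ...so a single reshape inside the op yields qparams shaped like n_blocks,
# matching what IntxUnpackedToInt8Tensor asserts, with no caller-side reshape
print(min_val.reshape(*n_blocks).shape)  # torch.Size([2, 2])
```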

We could also remove the `block_size` argument from `quantize_affine` and `dequantize_affine` afterwards, since the ranks are already aligned.
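
For intuition, a hedged sketch (hypothetical shapes, not the actual op signature) of why `block_size` becomes redundant once scale shares the input's rank:

```python
import torch

# Hypothetical shapes: qdata (4, 8), scale shaped like n_blocks (2, 2)
qdata = torch.randint(-8, 8, (4, 8), dtype=torch.int8)
scale = torch.rand(2, 2)

# Because scale has the same rank as qdata, each block size is implied
# by the ratio of the corresponding dimensions
inferred_block_size = tuple(qdata.shape[i] // scale.shape[i] for i in range(qdata.dim()))
print(inferred_block_size)  # (2, 4)
```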

We can also add docs to both ops afterwards to clarify the new behavior.
