Convert OPT MatMuls with quantized inputs to MatMulInteger #1585
Conversation
Approving tentatively; will test it shortly.
Also, please update the PR description.
Added the description.
Looks good to me after the fix
This change enables converting torch.bmm operations with two quantized inputs and a non-quantized output to MatMulInteger. This procedure is mutually exclusive with the existing one for QATMatMul-based quantized matmuls.
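For reference, a minimal numeric sketch of the equivalence this conversion relies on (the scales, zero points, and shapes below are illustrative, not taken from the actual OPT graph): MatMulInteger accumulates the integer product in int32, and rescaling that accumulator by the two input scales recovers the float output that a float bmm on the dequantized inputs would produce.

```python
import numpy as np

# Hypothetical per-tensor quantization parameters (illustrative only)
a_scale, b_scale = 0.02, 0.05
a_zp, b_zp = np.int8(0), np.int8(0)

# Batched int8 inputs, as torch.bmm would see them after quantization
a_q = np.random.randint(-128, 127, size=(2, 4, 8), dtype=np.int8)
b_q = np.random.randint(-128, 127, size=(2, 8, 4), dtype=np.int8)

# What MatMulInteger computes: (A - a_zero_point) @ (B - b_zero_point) in int32
acc = (a_q.astype(np.int32) - a_zp) @ (b_q.astype(np.int32) - b_zp)

# Non-quantized float output: rescale the int32 accumulator
out = a_scale * b_scale * acc.astype(np.float32)

# Reference path: dequantize first, then do the matmul in float
ref = (a_scale * (a_q.astype(np.float32) - a_zp)) @ \
      (b_scale * (b_q.astype(np.float32) - b_zp))

assert np.allclose(out, ref)
```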
The attached graph shows two MatMulInteger nodes that result from this conversion on OPT-125m.
Note that quantizing these MatMuls on OPT requires this PR.