[Relay][Quantization] Extend FakeQuantizationToInteger to more ops #8241

mbrookhart · 2021-06-10T22:20:05Z

Adding more ops to support QAT BERT. I also refactored the tests for easier extensions.

I marked this as WIP because the result isn't matching tflite. I need to do some more comparisons over the next day or two to isolate the problem.

I'm opening the PR now because supporting ops with multiple outputs required refactoring how I handle quantization-specific types. Bringing it into the make ir namespace allows me to do what I need to do, but it also opens up the question of whether or not this could be used more generally in other places for quantization.
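To make the multiple-output point concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes a quantization-specific type is a (scale, zero point, dtype) triple per tensor, and that an op returning several tensors needs a tuple of those. The names `TensorAffine`, `TupleAffine`, and `split_affine` are hypothetical stand-ins for illustration and are not TVM classes or functions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TensorAffine:
    """Hypothetical per-tensor quantization parameters.

    Represents the affine map real = scale * (q - zero_point).
    """
    scale: float
    zero_point: int
    dtype: str


@dataclass(frozen=True)
class TupleAffine:
    """Hypothetical container for an op that returns several tensors."""
    fields: Tuple[TensorAffine, ...]


def split_affine(input_type: TensorAffine, num_outputs: int) -> TupleAffine:
    """Type rule for a split-like op (an assumption for illustration).

    Splitting only rearranges values, so every output is assumed to
    inherit the input's scale, zero point, and dtype unchanged.
    """
    return TupleAffine(tuple(input_type for _ in range(num_outputs)))


out = split_affine(TensorAffine(scale=0.05, zero_point=128, dtype="uint8"), 3)
```

With only a single-tensor type there would be nowhere to record the parameters of the second and third outputs, which is the gap a tuple-shaped type fills.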

cc @jroesch @masahi @anijain2305 @jwfromm

masahi · 2021-06-10T22:25:13Z

> I marked this as WIP because the result isn't matching tflite

How big is the difference? We should evaluate some accuracy metric on a whole dataset.

mbrookhart · 2021-06-10T23:03:08Z

I've only tested random data so far. I'm looking for a dataset I can test with tomorrow.

mbrookhart · 2021-07-14T21:41:31Z

Sorry for the delay on this. I've hit some very subtle accuracy bugs on a BERT model I was using that made me doubt this. I see slight differences (1.5% rmse) on all comparisons on that model (tflite->onnxruntime, tflite->tvm, tflite->tvm + this, tvm->onnx, etc.). I think that might be a complication of that model in particular.
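For readers who want to reproduce this kind of comparison, the sketch below computes a relative RMSE between two output vectors. The thread does not say how the 1.5% figure was normalized, so dividing by the RMS of the reference output is an assumption, and `relative_rmse` is a hypothetical helper name.

```python
import math


def relative_rmse(reference, candidate):
    """RMSE of (candidate - reference), normalized by the RMS of reference.

    The normalization is an assumption; other conventions (for example
    dividing by the value range) would give different percentages.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    sq_err = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    sq_ref = sum(r ** 2 for r in reference)
    if sq_ref == 0:
        raise ValueError("reference output is all zeros")
    return math.sqrt(sq_err / sq_ref)


# Example: compare the flattened outputs of two runtimes on the same input.
same = relative_rmse([1.0, 2.0], [1.0, 2.0])
worst = relative_rmse([3.0, 4.0], [0.0, 0.0])
```

Running the same metric over every pair of runtimes, as in the list above, helps separate a difference introduced by one conversion path from one that is inherent to the model.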

mbrookhart · 2021-07-19T17:37:24Z

@jwfromm @masahi @jroesch @anijain2305 can you take a look?

jwfromm

A few nitpicks on comments, but overall this looks excellent. The one thing that seems missing is a brief tutorial that explains the concept of fake quantization and shows how to import and run a fake quantized model. We can defer writing that tutorial to a later PR, but we should try not to drop it.
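As a rough illustration of the concept such a tutorial would cover, the sketch below uses plain Python rather than TVM. A fake quantized graph wraps a float op in dequantize and quantize steps; a pass such as FakeQuantizationToInteger aims to replace that pattern with an equivalent integer op. The helper names, the int8 range, and the choice of ReLU with identical input and output parameters are all assumptions made for the example.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to an integer and clamp to the int8 range.

    Python's round() is round-half-to-even; a real framework may round
    differently, which matters only for values exactly halfway.
    """
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer back to the real value it represents."""
    return scale * (q - zero_point)


def fake_quantized_relu(q, scale, zero_point):
    """Fake quantized form: dequantize -> float relu -> quantize."""
    x = dequantize(q, scale, zero_point)
    y = max(0.0, x)
    return quantize(y, scale, zero_point)


def integer_relu(q, zero_point):
    """Integer-only form: real zero corresponds to zero_point."""
    return max(q, zero_point)


scale, zero_point = 0.5, 3
matches = all(
    fake_quantized_relu(q, scale, zero_point) == integer_relu(q, zero_point)
    for q in range(-128, 128)
)
```

Both forms agree on every int8 input here, so the float round trip can be dropped without changing the result. Ops whose inputs and outputs use different scales need requantization in the integer form, which this sketch does not cover.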


mbrookhart · 2021-08-03T05:34:09Z

@elvin-n @jwfromm @masahi @anijain2305 This is finally green after the CI chaos last week. Does anyone want to take another look before merging?

masahi · 2021-08-03T05:54:10Z

Thanks @mbrookhart @jwfromm @elvin-n

…pache#8241)
* support scalars in quantize and requantize
* Add affine type support for ops with multipe output, use it in concat, move to header
* support new ops, refactor tests
* add more binary ops fix pylint fix black black broke pylint oops on black
* fix a typo in a branch and add a test that hits it
* improve comments

mbrookhart force-pushed the BERT_FQ2I branch from ba175c8 to 5091642 Compare July 14, 2021 21:39

mbrookhart force-pushed the BERT_FQ2I branch 2 times, most recently from 3213e48 to e877fa7 Compare July 15, 2021 19:18

mbrookhart changed the title ~~[WIP][Relay][Quantization] Extend FakeQuantizationToInteger to more ops~~ [Relay][Quantization] Extend FakeQuantizationToInteger to more ops Jul 15, 2021

mbrookhart force-pushed the BERT_FQ2I branch from e877fa7 to 9c356f3 Compare July 16, 2021 19:30

elvin-n reviewed Jul 17, 2021

python/tvm/relay/transform/fake_quantization_to_integer.py (outdated, resolved)

elvin-n approved these changes Jul 19, 2021

mbrookhart force-pushed the BERT_FQ2I branch from 68f5954 to a16fca9 Compare July 20, 2021 16:11

mbrookhart requested review from areusch, comaniac, icemelon, jroesch, junrushao, merrymercy, tqchen, yzhliu and zhiics as code owners July 20, 2021 16:11

mbrookhart force-pushed the BERT_FQ2I branch from a16fca9 to 623a877 Compare July 21, 2021 16:30

mbrookhart requested review from anijain2305, Huyuwei, jwfromm, kazum, MarisaKirisame, siju-samuel, slyubomirsky and srkreddy1238 as code owners July 21, 2021 16:30

mbrookhart requested review from vinx13, wweic and ZihengJiang as code owners July 21, 2021 16:30

jwfromm approved these changes Jul 22, 2021

python/tvm/relay/transform/fake_quantization_to_integer.py (resolved)

src/relay/transforms/fake_quantization_to_integer.cc (resolved)

src/relay/transforms/fake_quantization_to_integer.cc (outdated, resolved)

mbrookhart force-pushed the BERT_FQ2I branch 3 times, most recently from c91982c to 97ffe0f Compare July 29, 2021 18:33

Matthew added 6 commits August 2, 2021 14:19

b3d5378 support scalars in quantize and requantize

c2306f3 Add affine type support for ops with multipe output, use it in concat, move to header

0bc295b support new ops, refactor tests

f35a2ee add more binary ops (fix pylint, fix black, black broke pylint, oops on black)

107822f fix a typo in a branch and add a test that hits it

f7b57d3 improve comments

mbrookhart force-pushed the BERT_FQ2I branch from 97ffe0f to f7b57d3 Compare August 2, 2021 20:20

masahi approved these changes Aug 3, 2021

masahi merged commit 38fe522 into apache:main Aug 3, 2021

mbrookhart deleted the BERT_FQ2I branch August 3, 2021 12:58
