[Relay][Quantization] Extend FakeQuantizationToInteger to more ops #8241

Merged: masahi merged 6 commits into apache:main from the BERT_FQ2I branch on Aug 3, 2021

Conversation

mbrookhart (Contributor)

Adding more ops to support QAT BERT. I also refactored the tests for easier extensions.

I marked this as WIP because the result isn't matching tflite; I need to do some more comparisons over the next day or two to isolate the problem.

I'm opening the PR now because supporting ops with multiple outputs required refactoring how I handle quantization-specific types. Bringing it into the main IR namespace lets me do what I need, but it also raises the question of whether this could be used more generally in other places for quantization.
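
For context on what the registration looks like, here is a minimal, hedged sketch of hooking a simple identity-style op into the pass. The decorator name and the `TensorAffineType` constructor below are assumptions based on the pass around the time of this PR, and `"copy"` is just a stand-in op chosen for illustration; multi-output ops such as concatenate would instead return a tuple-style affine type.

```python
# Illustrative sketch only: register_fake_quantization_to_integer and
# TensorAffineType are assumptions about the pass as of this PR, and "copy"
# is a stand-in op used purely for illustration.
from tvm.ir.affine_type import TensorAffineType
from tvm.relay.transform.fake_quantization_to_integer import (
    register_fake_quantization_to_integer,
)


@register_fake_quantization_to_integer("copy")
def copy_rewrite(expr, type_map):
    # An op that only moves data can run directly on the integer tensor;
    # the output inherits the input's scale, zero point, and dtype.
    t = type_map[expr.args[0]]
    return [expr, TensorAffineType(t.scale, t.zero_point, t.dtype)]
```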

cc @jroesch @masahi @anijain2305 @jwfromm

@masahi (Member) commented Jun 10, 2021

> I marked this as WIP because the result isn't matching tflite

How big is the difference? We should evaluate some accuracy metric on a whole dataset.

@mbrookhart (Contributor, Author)

I've only tested with random data so far; I'm looking for a dataset I can test with tomorrow.

@mbrookhart (Contributor, Author)

Sorry for the delay on this. I've hit some very subtle accuracy bugs on a BERT model I was using that made me doubt this. I see slight differences (1.5% RMSE) in all comparisons on that model (tflite->onnxruntime, tflite->tvm, tflite->tvm + this, tvm->onnx, etc.), so I think that might be a complication specific to that model.
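
(For reference, the comparison described above amounts to something like the following NumPy sketch; `tflite_out` and `tvm_out` are placeholder names for the two runtimes' outputs on the same input, and reading the 1.5% figure as a relative RMSE is an assumption.)

```python
import numpy as np


def relative_rmse(reference, candidate):
    """RMSE of candidate vs. reference, scaled by the reference magnitude."""
    err = np.sqrt(np.mean((reference - candidate) ** 2))
    return err / (np.sqrt(np.mean(reference ** 2)) + 1e-12)


# Placeholder tensors standing in for the outputs of two runtimes on the
# same input; in the real comparison these would be tflite and TVM outputs.
tflite_out = np.random.randn(1, 128).astype("float32")
tvm_out = tflite_out + 0.015 * np.random.randn(1, 128).astype("float32")
print(f"relative RMSE: {relative_rmse(tflite_out, tvm_out):.4f}")
```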

@mbrookhart mbrookhart force-pushed the BERT_FQ2I branch 2 times, most recently from 3213e48 to e877fa7 Compare July 15, 2021 19:18
@mbrookhart mbrookhart changed the title [WIP][Relay][Quantization] Extend FakeQuantizationToInteger to more ops [Relay][Quantization] Extend FakeQuantizationToInteger to more ops Jul 15, 2021
@mbrookhart (Contributor, Author)

@jwfromm @masahi @jroesch @anijain2305 can you take a look?

@jwfromm (Contributor) left a comment


A few nitpicks on comments, but this overall looks excellent. The one thing that seems missing is a brief tutorial that explains the concept of fake quantization and shows how to import and run a fake-quantized model. We can defer writing that tutorial to a later PR, but we should try not to drop it.
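
In the meantime, a minimal sketch of what that flow could look like, assuming a tiny hand-built graph as a stand-in for an imported fake-quantized model (in practice `mod` would come from a frontend such as `relay.frontend.from_tflite`), and assuming `reshape` is among the ops the pass handles:

```python
# A minimal sketch, not the tutorial itself: the tiny hand-built graph below
# stands in for a model imported from a QAT frontend (an assumption about the
# usual workflow), and the op choice is illustrative.
import tvm
from tvm import relay

x = relay.var("x", shape=(1, 8), dtype="int8")
deq = relay.qnn.op.dequantize(x, relay.const(0.05), relay.const(0))
body = relay.reshape(deq, (8, 1))
out = relay.qnn.op.quantize(body, relay.const(0.05), relay.const(0), out_dtype="int8")
mod = tvm.IRModule.from_expr(relay.Function([x], out))

# Rewrite the dequantize -> float-op -> quantize chain into integer ops.
mod = relay.transform.InferType()(mod)
mod = relay.transform.FakeQuantizationToInteger()(mod)
print(mod)
```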

@mbrookhart mbrookhart force-pushed the BERT_FQ2I branch 3 times, most recently from c91982c to 97ffe0f Compare July 29, 2021 18:33
@mbrookhart (Contributor, Author)

@elvin-n @jwfromm @masahi @anijain2305 This is finally green after the CI chaos last week. Does anyone want to take another look before merging?

@masahi masahi merged commit 38fe522 into apache:main Aug 3, 2021
@masahi (Member) commented Aug 3, 2021

Thanks @mbrookhart @jwfromm @elvin-n

@mbrookhart mbrookhart deleted the BERT_FQ2I branch August 3, 2021 12:58
ylc pushed a commit to ylc/tvm that referenced this pull request on Sep 29, 2021 (…pache#8241):

* support scalars in quantize and requantize
* Add affine type support for ops with multiple outputs, use it in concat, move to header
* support new ops, refactor tests
* add more binary ops
  * fix pylint
  * fix black
  * black broke pylint
  * oops on black
* fix a typo in a branch and add a test that hits it
* improve comments
ylc pushed a commit to ylc/tvm that referenced this pull request on Jan 13, 2022 (…pache#8241).