-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Relay][Quantization] Extend FakeQuantizationToInteger to more ops #8241
Conversation
How big is the difference? We should evaluate some accuracy metric on a whole dataset. |
I've only tested random data so far, looking for a dataset I can test with tomorrow |
Sorry for the delay on this. I've hit some very subtle accuracy bugs on a BERT model I was using that made me doubt this. I see slightly differences (1.5% rmse) on all comparisons on that model (tflite->onnxruntime, tflite->tvm, tflite->tvm + this, tvm->onnx, etc) I think that might a complication to that model in particular. |
3213e48
to
e877fa7
Compare
@jwfromm @masahi @jroesch @anijain2305 can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nitpicks on comments but this overall looks excellent. The one thing that seems missing is a brief tutorial that explains the concept of fake quantization and shows how to import and run a fake quantized model. We can differ writing that tutorial for a later PR but we should try not to drop it.
c91982c
to
97ffe0f
Compare
fix pylint fix black black broke pylint oops on black
@elvin-n @jwfromm @masahi @anijain2305 This is finally green after the CI chaos last week, anyone want to take another look before merging? |
Thanks @mbrookhart @jwfromm @elvin-n |
…pache#8241) * support scalars in quantize and requantize * Add affine type support for ops with multipe output, use it in concat, move to header * support new ops, refactor tests * add more binary ops fix pylint fix black black broke pylint oops on black * fix a typo in a branch and add a test that hits it * improve comments
…pache#8241) * support scalars in quantize and requantize * Add affine type support for ops with multipe output, use it in concat, move to header * support new ops, refactor tests * add more binary ops fix pylint fix black black broke pylint oops on black * fix a typo in a branch and add a test that hits it * improve comments
Adding more ops to support QAT BERT. I also refactored the tests for easier extensions.
I marked this as WIP because the result isn't matching tflite, I need to do some more comparisons over the next day or two to isolate the problem.
I'm opening the PR now because supporting ops with multiple outputs required refactoring how I handle quantization specific types. Bringing it into the make ir namespace allows me to do what I need to do, but it also opens up a question of where or not this could be more used more generally in other places for quantization?
cc @jroesch @masahi @anijain2305 @jwfromm