gpu: nvidia: conv: Fix int8 convolution primitive fails #1760
Just a reminder that oneDNN v3.4 code freeze is coming on January 26.
We have updated the PR. We follow the equation to apply the destination scaling after the post-operations. As
There is another class of test cases that fail: no dst scale, but dst is s8:
./build/tests/benchdnn/benchdnn --conv --engine=gpu --skip-impl=ref --dir=FWD_I --dt=s8:s8:s8 --attr-scales=src:common:2 --attr-post-ops=sum mb1ic512iw121oc512ow122kw6pw3nconv1d:21
Even when the dst scale is not set, the post-ops computation should happen in f32 and only then should the dst be quantized; in that case dst_scale is simply equal to 1 (the default scale value).
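To make the point above concrete, here is a minimal Python sketch of a single output element with a sum post-op and an s8 destination. It is only an illustration, not oneDNN or cuDNN code: the function names are hypothetical, the sum post-op is assumed to add the previous dst value with a scale of 1, and the dst scale is assumed to be applied as a division before the final conversion.

```python
def saturate_s8(x: float) -> int:
    """Round to nearest and clamp to the signed 8-bit range."""
    return max(-128, min(127, round(x)))


def conv_sum_f32(acc, prev_dst, src_scale=1.0, wei_scale=1.0, dst_scale=1.0):
    """Expected order: scale, apply the sum post-op in f32, then quantize."""
    v = float(acc) * src_scale * wei_scale   # src/wei scales on the accumulator
    v += float(prev_dst)                     # sum post-op, still in f32
    return saturate_s8(v / dst_scale)        # dst_scale defaults to 1


def conv_sum_early_s8(acc, prev_dst, src_scale=1.0, wei_scale=1.0, dst_scale=1.0):
    """Problematic order: the result is converted to s8 before the post-op."""
    v = saturate_s8(float(acc) * src_scale * wei_scale)  # saturates too early
    v = float(v) + float(prev_dst)
    return saturate_s8(v / dst_scale)


# Accumulator 100 with src scale 2 gives 200, which does not fit in s8,
# but the sum with a previous dst of -100 brings it back into range.
print(conv_sum_f32(100, -100, src_scale=2.0))       # 100
print(conv_sum_early_s8(100, -100, src_scale=2.0))  # 27
```

With no dst scale set, the two orders still disagree, which is why this class of test cases fails even though dst_scale is 1.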
I addressed this comment in the last commit. Just a kind reminder to take a look at it. Thanks @igorsafo
@kala855 FYI, I pushed a minor change to reduce code duplication.
Please squash all the changes and remove all
@dzarukin I did the squash as you mentioned. Thank you very much. Let me know if something else needs to be done.
Thanks for the contribution.
Description
When destination scaling is applied to int8 data types, some benchmarks fail. This happens because the scaling was applied to the int8 result, which causes saturation issues.
Fixes #1749
Solution:
We create a temporary vector to hold the values in f32 and keep the computation as expected by oneDNN. The scratchpad_size is modified because of this change. We launch cudnnConvolutionForward to obtain its result as f32. We apply the scaling parameters for src/wei at this stage. After the post-operations, we apply the scaling for dst over the f32 result to avoid saturation issues. Finally, the result is converted to s8.
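The order of operations described in the solution can be sketched in plain Python as follows. This is a hedged illustration rather than the code in this PR: the function names and the list-based "buffers" are hypothetical, the post-op is modelled as an arbitrary callable, and the dst scale is assumed to be applied as a division before the final conversion.

```python
def saturate_s8(x: float) -> int:
    """Round to nearest and clamp to the signed 8-bit range."""
    return max(-128, min(127, round(x)))


def output_f32_path(acc, src_scale, wei_scale, dst_scale, post_op=lambda v: v):
    """Keep everything in f32 and convert to s8 only at the very end."""
    out = []
    for a in acc:
        v = float(a) * src_scale * wei_scale  # src/wei scaling on the f32 result
        v = post_op(v)                        # post-operations in f32
        v = v / dst_scale                     # dst scaling, still in f32
        out.append(saturate_s8(v))            # final conversion to s8
    return out


def output_s8_path(acc, src_scale, wei_scale, dst_scale, post_op=lambda v: v):
    """Previous behaviour: dst scaling is applied over an already-int8 value."""
    out = []
    for a in acc:
        v = saturate_s8(float(a) * src_scale * wei_scale)  # early saturation
        v = post_op(float(v))
        out.append(saturate_s8(v / dst_scale))
    return out


def relu(v: float) -> float:
    return max(v, 0.0)


acc = [300, -600, 40]
print(output_f32_path(acc, 1.0, 1.0, 4.0, relu))  # [75, 0, 10]
print(output_s8_path(acc, 1.0, 1.0, 4.0, relu))   # [32, 0, 10]
```

The first element shows the saturation issue: 300 is clamped to 127 before the dst scale is applied, so the scaled result is 32 instead of 75.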
Checklist
General
Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
Bug fixes