
gpu: nvidia: conv: Fix int8 convolution primitive fails #1760

Merged · 1 commit · May 5, 2024

Conversation

@kala855 (Contributor) commented Nov 28, 2023

Description

When destination scaling is applied to int8 data types, some benchmarks fail. This happened because the scaling was applied over the int8 result, causing saturation issues.

Fixes #1749

Solution:

We create a temporary buffer to hold the values in f32 and keep the computation as oneDNN expects; the scratchpad size is modified because of this change. We launch cudnnConvolutionForward so that its result is produced in f32 and apply the scaling parameters for src/wei at this stage. After the post-operations, we apply the scaling for dst over the f32 result to avoid saturation issues. Finally, the result is converted to s8.
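The steps above can be illustrated with a minimal scalar sketch (plain Python, not the actual cuDNN/oneDNN code; the function names and the single-accumulator model are hypothetical, and it assumes oneDNN's convention that the f32 result is divided by dst_scale before conversion to s8):

```python
def s8_saturate(x):
    # Convert f32 -> s8 with rounding and saturation.
    return max(-128, min(127, int(round(x))))

def conv_int8_fixed(acc, src_scale, wei_scale, dst_scale, post_op=None):
    # New flow: keep the conv result in a temporary f32 value
    # (a scratchpad buffer in the real code) until the very end.
    val = float(acc) * src_scale * wei_scale  # apply src/wei scales in f32
    if post_op is not None:
        val = post_op(val)                    # post-ops run in f32
    val = val / dst_scale                     # dst scaling, still in f32
    return s8_saturate(val)                   # convert to s8 exactly once

def conv_int8_buggy(acc, src_scale, wei_scale, dst_scale, post_op=None):
    # Old behavior: the result was saturated to s8 first, then dst
    # scaling was applied over the already-saturated int8 value.
    val = float(acc) * src_scale * wei_scale
    if post_op is not None:
        val = post_op(val)
    s8 = s8_saturate(val)                     # premature saturation
    return s8_saturate(s8 / dst_scale)

# With a large accumulator and dst_scale > 1, the old path saturates:
print(conv_int8_fixed(400, 1.0, 1.0, 4.0))  # 400 / 4 = 100
print(conv_int8_buggy(400, 1.0, 1.0, 4.0))  # 127 / 4 rounds to 32
```

The hypothetical example shows why the order matters: dividing 400 by the dst scale first yields 100, which fits in s8, while saturating to 127 first destroys information that no later scaling can recover.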

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Bug fixes

  • Have you included information on how to reproduce the issue (either in a GitHub issue or in this PR)?

@vpirogov vpirogov added this to the v3.4 milestone Jan 12, 2024
@vpirogov (Member)

Just a reminder that oneDNN v3.4 code freeze is coming on January 26.

@kala855 (Contributor, Author) commented Jan 16, 2024

We have updated the PR. We now follow the quantization equation and apply the destination scaling after the post-operations. Since cudnnConvolutionForward does not support producing its output in s32 format, we obtain the result as f32 when the inputs are s8. Then the scaling parameters for src/wei are applied. Finally, after the post-operations, the scaling for dst is applied over the f32 value, which is then converted to s8.

@kala855 kala855 requested a review from igorsafo February 8, 2024 09:26
@vpirogov vpirogov modified the milestones: v3.4, v3.5 Feb 13, 2024
@igorsafo (Contributor) left a comment

There is another class of test cases that fail: no dst scale, but dst is s8:
./build/tests/benchdnn/benchdnn --conv --engine=gpu --skip-impl=ref --dir=FWD_I --dt=s8:s8:s8 --attr-scales=src:common:2 --attr-post-ops=sum mb1ic512iw121oc512ow122kw6pw3nconv1d:21

Even if the dst scale is not set, the post-ops computation should happen in f32 and then the dst should be quantized; the dst_scale will simply be equal to 1 (the default scale value).
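This failing class of cases can be sketched with a scalar model of the benchdnn reproducer above (hypothetical plain-Python functions, not the actual implementation; it models the s8 conv with a src scale of 2, a sum post-op, and no user-set dst scale):

```python
def quantize_s8(x):
    # Round and saturate an f32 value to s8.
    return max(-128, min(127, int(round(x))))

def conv_fwd_s8(acc, src_scale, wei_scale, sum_dst, dst_scale=1.0):
    # Correct flow: post-ops stay in f32 even when no dst scale is set;
    # dst_scale just defaults to 1 and quantization happens once, at the end.
    val = float(acc) * src_scale * wei_scale  # dequantized conv result, f32
    val += float(sum_dst)                     # sum post-op computed in f32
    return quantize_s8(val / dst_scale)       # single quantization step

def conv_fwd_s8_buggy(acc, src_scale, wei_scale, sum_dst):
    # Failing behavior: the conv result is saturated to s8 before
    # the sum post-op, so the intermediate value is already clipped.
    clipped = quantize_s8(float(acc) * src_scale * wei_scale)
    return quantize_s8(clipped + float(sum_dst))

# src_scale=2 pushes the intermediate past 127; the sum post-op would
# bring it back into range, but only if nothing was clipped yet:
print(conv_fwd_s8(90, 2.0, 1.0, -100))        # 2*90 - 100 = 80
print(conv_fwd_s8_buggy(90, 2.0, 1.0, -100))  # 127 - 100 = 27
```

In this model the two paths agree whenever the f32 intermediate stays within s8 range, which is why the bug only shows up in cases like the reproducer above where the src-scaled accumulation exceeds 127 before the sum post-op.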

@kala855 kala855 requested a review from igorsafo March 26, 2024 08:55
@vpirogov vpirogov added the platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia label Mar 29, 2024
@kala855 (Contributor, Author) commented Apr 3, 2024

> There is another class of test cases that fail: no dst scale, but dst is s8: ./build/tests/benchdnn/benchdnn --conv --engine=gpu --skip-impl=ref --dir=FWD_I --dt=s8:s8:s8 --attr-scales=src:common:2 --attr-post-ops=sum mb1ic512iw121oc512ow122kw6pw3nconv1d:21
>
> Even if the dst scale is not set, the post-ops computation should happen in f32 and then the dst should be quantized; the dst_scale will simply be equal to 1 (the default scale value).

I addressed this comment in the last commit. Just a kind reminder to take a look at it. Thanks @igorsafo

@igorsafo (Contributor)

@kala855 FYI, I pushed a minor change to reduce code duplication.

@dzarukin (Contributor) commented May 1, 2024

Please squash all the changes and remove all merge commits - the history must be linear.

@kala855 (Contributor, Author) commented May 2, 2024

@dzarukin I did the squash as you mentioned. Thank you very much. Let me know if something else needs to be done.

@dzarukin dzarukin merged commit a986231 into main May 5, 2024
11 checks passed
@dzarukin dzarukin deleted the kala855/1749 branch May 5, 2024 01:12
@dzarukin (Contributor) commented May 5, 2024

Thanks for the contribution.

Successfully merging this pull request may close these issues.

[nvidia] int8 convolution primitive fails correctness check
5 participants