[SYCL][CUDA] Add bf16 builtins operating on storage types #5748

Merged: 6 commits merged into intel:sycl on Mar 14, 2022

Conversation

@t4c1 (Contributor) commented Mar 7, 2022

Add bf16 builtins operating on storage types. Partially implements https://github.com/intel/llvm/pull/5645/files for CUDA (only operations on storage types).

This PR includes a bugfix for some NVPTX intrinsics, which will also be pushed upstream.

Blocked by #5724.

Tests for this are in intel/llvm-test-suite#897.
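
A minimal usage sketch of what calling these storage-type builtins might look like; the sycl::ext::oneapi namespace and plain uint16_t as the bfloat16 storage type are assumptions inferred from the bf16_storage_builtins.hpp header path in the CI log further down, not verified API from this PR.

```cpp
// Hypothetical sketch only: assumes the builtins are reachable through
// <sycl/sycl.hpp> in namespace sycl::ext::oneapi and accept uint16_t as the
// bfloat16 storage type (inferred from bf16_storage_builtins.hpp below).
#include <sycl/sycl.hpp>
#include <cstdint>

int main() {
  sycl::queue q;
  uint16_t *bits = sycl::malloc_shared<uint16_t>(1, q);
  bits[0] = 0xBF80; // bfloat16 bit pattern for -1.0f

  q.single_task([=] {
    // fabs operates directly on the raw bfloat16 storage representation.
    bits[0] = sycl::ext::oneapi::fabs(bits[0]);
  }).wait();

  // bits[0] should now hold 0x3F80, the bfloat16 bit pattern for +1.0f.
  sycl::free(bits, q);
  return 0;
}
```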

@mlychkov (Contributor) previously approved these changes Mar 10, 2022

Changes in LLVM intrinsics LGTM.

@s-kanaev (Contributor) left a comment

RT changes LGTM.

@t4c1, should there be a test for this change?

@t4c1 (Contributor, author) commented Mar 14, 2022

Which change do you have in mind? There are some tests being added to the test suite (linked in the PR description). Or do you mean something else needs testing?

@s-kanaev (Contributor) previously approved these changes Mar 14, 2022

RT changes LGTM

@s-kanaev (Contributor) replied:

> There are some tests being added to the test suite (linked in the PR description).

@t4c1, sorry, didn't notice that at first glance. Seems like that'll do.

@bader (Contributor) commented Mar 14, 2022

I had to merge with the sycl branch to resolve the conflict with 53a9d54.

@t4c1 (Contributor, author) commented Mar 14, 2022

Thanks.

@bader merged commit 413a9ef into intel:sycl Mar 14, 2022
@bader (Contributor) commented Mar 14, 2022

@t4c1, could you fix these warnings, please?
From https://github.com/intel/llvm/runs/5544122115?check_suite_focus=true

/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:40:68: error: unused parameter 'x' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fabs(T x) {
                                                                   ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:49:68: error: unused parameter 'x' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fmin(T x, T y) {
                                                                   ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:49:73: error: unused parameter 'y' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fmin(T x, T y) {
                                                                        ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:58:68: error: unused parameter 'x' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fmax(T x, T y) {
                                                                   ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:58:73: error: unused parameter 'y' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fmax(T x, T y) {
                                                                        ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:67:67: error: unused parameter 'x' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fma(T x, T y, T z) {
                                                                  ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:67:72: error: unused parameter 'y' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fma(T x, T y, T z) {
                                                                       ^
/home/runner/work/llvm/llvm/build/include/sycl/ext/oneapi/bf16_storage_builtins.hpp:67:77: error: unused parameter 'z' [-Werror,-Wunused-parameter]
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fma(T x, T y, T z) {
                                                                            ^
8 errors generated.
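
These errors typically come from the host compilation pass, where the parameters are only consumed in the device-only branch of the builtin. Below is a minimal sketch of that pattern and one common way to silence the warning; detail::is_bf16_storage_type is modeled on the log above and __clc_fabs is a hypothetical device intrinsic name, so this is an illustration, not the fix that actually landed.

```cpp
// Sketch of the warning pattern, not the actual patch. The trait definition
// and the device intrinsic name are assumptions made for illustration.
#include <cstdint>
#include <stdexcept>
#include <type_traits>

namespace detail {
// Assumed trait: treat uint16_t as the bfloat16 storage type.
template <typename T>
struct is_bf16_storage_type : std::is_same<T, uint16_t> {};
} // namespace detail

template <typename T>
std::enable_if_t<detail::is_bf16_storage_type<T>::value, T> fabs(T x) {
#ifdef __SYCL_DEVICE_ONLY__
  return __clc_fabs(x); // hypothetical device intrinsic
#else
  (void)x; // host pass: parameter is intentionally unused; the cast avoids -Wunused-parameter
  throw std::runtime_error("bf16 storage builtins are device-only");
#endif
}
```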

steffenlarsen pushed a commit that referenced this pull request Jun 30, 2022
This PR introduces full support for element-wise operations in the CUDA backend. `wi_data`, `get_matrix_fill`, and `joint_matrix.get_wi_data()` are introduced for portability with the Intel backend. In addition, in the CUDA backend users can call `joint_matrix.wi_marray` to access the marray that stores the WI-owned elements of the matrix and perform optimized element-wise operations using math functions that take marrays.
bfloat16 element-wise operation support is also included: this PR adds bfloat16 scalar/marray implementations of the fma, fmax, fmin, and fabs math functions, replacing the existing uint16_t "storage type" implementations. The bfloat16 fma_relu function implementation has now been added directly in #5749.
The existing temporary uint16_t implementations (introduced in #5748 with unmerged tests intel/llvm-test-suite#897) have been removed, since the bfloat16 implementations replace them.

Signed-off-by: jack.kirk <jack.kirk@codeplay.com>
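
For context, a purely illustrative sketch of the interface change that commit message describes: the math functions move from operating on raw uint16_t bit patterns (this PR) to operating on a bfloat16 type and marrays of it. The type and namespace locations shown (sycl::ext::oneapi::bfloat16, sycl::ext::oneapi::experimental::fabs) are assumptions, not signatures verified against that commit.

```cpp
// Illustrative only; the overload locations below are assumptions.
#include <sycl/sycl.hpp>
#include <cstdint>

// Old style (this PR, #5748): callers pass raw bfloat16 bit patterns.
uint16_t abs_via_storage(uint16_t bits) {
  return sycl::ext::oneapi::fabs(bits); // assumed storage-type overload
}

// New style (the follow-up commit): callers work with a bfloat16 value,
// and marray<bfloat16, N> overloads enable element-wise operations.
sycl::ext::oneapi::bfloat16 abs_via_type(sycl::ext::oneapi::bfloat16 x) {
  return sycl::ext::oneapi::experimental::fabs(x); // assumed bfloat16 overload
}
```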