
Adding rocBLAS support for BLAS domain through Intel DPCPP compiler with HIP backend support #189

Merged
mmeterel merged 10 commits into uxlfoundation:develop from TejaX-Alaghari:rocblas_hip_support
Sep 30, 2022

Conversation

@TejaX-Alaghari
Contributor

@TejaX-Alaghari TejaX-Alaghari commented Apr 25, 2022

Description

This PR extends the HIP backend of the BLAS domain, which is currently supported only with the hipSYCL compiler, to Intel's DPC++/SYCL compiler.

Checklist

All Submissions

  • Do all unit tests pass locally? hipSYCL.log llvm.log
  • Have you formatted the code using clang-format?

New features

  • Have you provided motivation for adding a new feature?
  • Have you added relevant tests?

@ajaypanyala
Copy link

@TejaX-Alaghari Thank you for this PR! I have built it and successfully tested it with an application containing SYCL code. I just had to change the following line in FindrocBLAS.cmake to IMPORTED_LOCATION "${HIP_PATH}/../rocblas/lib/librocblas.so" in order to handle non-standard ROCm paths. Could you please incorporate this small fix before the PR is merged? I have also filed a related issue on the intel llvm page.

@TejaX-Alaghari
Contributor Author

Thanks for the suggestion @ajaypanyala. I've added a commit for that change.
Do let me know if there are any other suggestions.

Contributor

@mmeterel mmeterel left a comment

Thanks for the PR. Looking forward to adding this feature.

I added some initial feedback. General comments:

  1. Please check whether the comments added for the cuBLAS backend still make sense here as well, and if so, update them accordingly.
  2. Please use the sycl:: namespace throughout the project.
  3. Please make sure header guards match header names.
  4. Please test the rocBLAS backend with hipSYCL to make sure nothing is broken, and update the logs.
  5. I have some concerns about Copyrights - will add my feedback later ==> feedback added
  6. Please update the README and docs folder as needed.
  7. In the test logs, please cover the MKLCPU backend as well. Also, I see segfault errors. Can you fix them?

CMakeLists.txt Outdated
Contributor

Could you explain which deprecation warnings are being disabled here? An example would be useful.

Contributor Author

@TejaX-Alaghari TejaX-Alaghari Sep 21, 2022

Below is an example of the disabled warnings:

/home/ubuntu/PR/oneMKL/src/blas/backends/rocblas/rocblas_task.hpp:57:58: warning: 'hip' is deprecated: use 'ext_oneapi_hip' instead [-Wdeprecated-declarations]
    hipStream_t stream = sycl::get_native<sycl::backend::hip>(queue);
                                                         ^
/opt/compiler/llvm/build_3/include/sycl/CL/sycl/backend_types.hpp:34:7: note: 'hip' has been explicitly marked deprecated here
  hip __SYCL2020_DEPRECATED("use 'ext_oneapi_hip' instead") = ext_oneapi_hip,
      ^
/opt/compiler/llvm/build_3/include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
                                       ^
/opt/compiler/llvm/build_3/include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]

Contributor

Why can't we use the recommended APIs instead of disabling the warnings?

Contributor Author

I agree with your suggestion, Mesut. The deprecation warning shown above can be fixed; my only concern is whether the fix breaks the hipSYCL configuration.

I'll also look into the other kinds of deprecation warnings being thrown and fix them wherever possible, after validating on hipSYCL as well.
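
As a minimal sketch (not the PR's actual code), the warning above could be addressed by switching to the enumerator the compiler note recommends; this assumes the DPC++ headers in use already provide sycl::backend::ext_oneapi_hip and that get_native for the HIP backend returns a hipStream_t, as the warning's context suggests:

#include <CL/sycl.hpp>
#include <hip/hip_runtime.h>

// Illustrative helper: obtain the native HIP stream from a SYCL queue.
static inline hipStream_t get_hip_stream(sycl::queue &queue) {
    // Deprecated spelling that triggers -Wdeprecated-declarations:
    //   sycl::get_native<sycl::backend::hip>(queue);
    // Non-deprecated spelling suggested by the compiler note:
    return sycl::get_native<sycl::backend::ext_oneapi_hip>(queue);
}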

@ajaypanyala

ajaypanyala commented May 12, 2022

@TejaX-Alaghari @mmeterel I get a link error in the final step.

[100%] Linking CXX shared library ../../../../lib/libonemkl_blas_rocblas.so
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lclang_rt.builtins-x86_64

Not sure if something needs to be fixed in this file.

For now, I got it to work by manually adding -L$ROCM_PATH/llvm/lib/clang/13.0.0/lib/linux to the link line where libclang_rt.builtins-x86_64.a exists. The link step tries to find clang_rt.builtins-x86_64 in the folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/lib/linux which doesn't exist. Only the include folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/include exists. I am building the intel-llvm compiler as described here

[EDIT] This issue probably belongs in the intel-llvm repo. There is a related issue that was just opened.

@mmeterel
Contributor

@TejaX-Alaghari @mmeterel I get a link error in the final step.

[100%] Linking CXX shared library ../../../../lib/libonemkl_blas_rocblas.so
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lclang_rt.builtins-x86_64

Not sure if something needs to be fixed in this file.

For now, I got it to work by manually adding -L$ROCM_PATH/llvm/lib/clang/13.0.0/lib/linux to the link line where libclang_rt.builtins-x86_64.a exists. The link step tries to find clang_rt.builtins-x86_64 in the folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/lib/linux which doesn't exist. Only the include folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/include exists. I am building the intel-llvm compiler as described here

[EDIT] This issue probably belongs in the intel-llvm repo. There is a related issue that was just opened.

@ajaypanyala @TejaX-Alaghari Thanks for the detailed explanation. I have seen the same issue earlier. I am leaning towards waiting for the LLVM issue to be fixed, so we don't have to recommend that users rely on a temporary hack. What do you think?

CMakeLists.txt Outdated
Contributor

Can you please fix the alignment?

@ajaypanyala

@TejaX-Alaghari @mmeterel I get a link error in the final step.

[100%] Linking CXX shared library ../../../../lib/libonemkl_blas_rocblas.so
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lclang_rt.builtins-x86_64

Not sure if something needs to be fixed in this file.
For now, I got it to work by manually adding -L$ROCM_PATH/llvm/lib/clang/13.0.0/lib/linux to the link line where libclang_rt.builtins-x86_64.a exists. The link step tries to find clang_rt.builtins-x86_64 in the folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/lib/linux which doesn't exist. Only the include folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/include exists. I am building the intel-llvm compiler as described here
[EDIT] This issue probably belongs in the intel-llvm repo. There is a related issue that was just opened.

@ajaypanyala @TejaX-Alaghari Thanks for the detailed explanation. I have seen the same issue earlier. I am leaning towards waiting for the LLVM issue to be fixed, so we don't have to recommend that users rely on a temporary hack. What do you think?

@mmeterel Sounds good. That would be perfect! Thanks!

@pengtu

pengtu commented Aug 11, 2022

@TejaX-Alaghari @mmeterel I get a link error in the final step.

[100%] Linking CXX shared library ../../../../lib/libonemkl_blas_rocblas.so
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lclang_rt.builtins-x86_64

Not sure if something needs to be fixed in this file.
For now, I got it to work by manually adding -L$ROCM_PATH/llvm/lib/clang/13.0.0/lib/linux to the link line where libclang_rt.builtins-x86_64.a exists. The link step tries to find clang_rt.builtins-x86_64 in the folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/lib/linux which doesn't exist. Only the include folder $INTEL_LLVM_INSTALL_PATH/lib/clang/15.0.0/include exists. I am building the intel-llvm compiler as described here
[EDIT] This issue probably belongs in the intel-llvm repo. There is a related issue that was just opened.

@ajaypanyala @TejaX-Alaghari Thanks for the detailed explanation. I have seen the same issue earlier. I am leaning towards waiting for the LLVM issue to be fixed, so we don't have to recommend that users rely on a temporary hack. What do you think?

@mmeterel: According to the compiler team, the SYCL compiler doesn't use the clang builtins library; it uses libgcc_s.so instead. In this case, the dependency on libclang_rt comes from the ROCm BLAS library, and the clang version of libclang_rt should match the version the ROCm library was built with. This suggests that adding -L$ROCM_PATH/llvm/lib/clang/13.0.0/lib/linux is the right permanent solution.

@mmeterel
Contributor

@TejaX-Alaghari
Sorry for the delay in getting back to this. I have a few follow-up requests and questions about this PR.

  1. Could you please update each conversation with your feedback?
  2. I left 7 comments on May 11. Could you please respond to each of these comments with the status? Let's do our best to have every question/comment answered.

@mmeterel
Contributor

mmeterel commented Sep 18, 2022

@TejaX-Alaghari
I tested this PR on an AMD system with ROCM version 5.0.0 and card version gfx90a.
LLVM with HIP backend is built from this commit: 66361038b63caaae566fc9648f5da50b74222b83

I am getting the warnings/errors/crashes shown in the attached files. I have tested only the Level 1 APIs for simplicity.

Could you please advise how to resolve them? Feel free to schedule a meeting to discuss in detail.

amd_llvm_level1.txt
amd_llvm_level1_verbose.txt

@TejaX-Alaghari
Contributor Author

Thanks for the PR. Looking forward to adding this feature.

I added some initial feedback. General comments:

  1. Please check whether the comments added for the cuBLAS backend still make sense here as well, and if so, update them accordingly.
  2. Please use the sycl:: namespace throughout the project.
  3. Please make sure header guards match header names.
  4. Please test the rocBLAS backend with hipSYCL to make sure nothing is broken, and update the logs.
  5. I have some concerns about Copyrights - will add my feedback later ==> feedback added
  6. Please update the README and docs folder as needed.
  7. In the test logs, please cover the MKLCPU backend as well. Also, I see segfault errors. Can you fix them?

Thanks for the detailed feedback on the PR. Please find below the responses to the above comments:

  1. The existing comments are relevant for the changes introduced in this PR
  2. All the earlier "cl::sycl::" namespaces have been updated to "sycl::"
  3. All header guards match their respective file names
  4. TBD
  5. Updated the copyrights per the suggestion for newly introduced files
  6. Updated README and docs with the build configuration for rocBLAS backend through clang++
  7. We haven't observed any issues while testing MKLCPU backend for BLAS with "ctest"

@mmeterel
Contributor

Tested the PR again after latest commits. Still seeing major failures. Please reach out for a debug session.

@mmeterel
Contributor

Tested the PR again after latest commits. Still seeing major failures. Please reach out for a debug session.

After discussing with @TejaX-Alaghari offline, this issue is most likely due to missing synchronization. A similar issue was fixed in #228 for the cuBLAS backend. @TejaX-Alaghari will work on updating this PR. Please note that similar updates will likely need to be made for the rocRAND (#218) and rocSOLVER (#208) PRs.
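
For context, a minimal sketch of the kind of synchronization fix applied for the cuBLAS backend in #228, transposed here to rocBLAS under the assumption that the backend issues rocBLAS calls from a SYCL host task onto a native HIP stream; the handle and helper names below are illustrative, not the PR's actual code:

#include <CL/sycl.hpp>
#include <hip/hip_runtime.h>
#include <rocblas.h>

// Illustrative submission path: without the final hipStreamSynchronize, the
// host task may return while rocBLAS work is still running on the stream,
// which matches the "missing synchronization" symptom discussed above.
void submit_rocblas_call(sycl::queue &queue, rocblas_handle handle) {
    queue.submit([&](sycl::handler &cgh) {
        cgh.host_task([=]() {
            hipStream_t stream = sycl::get_native<sycl::backend::ext_oneapi_hip>(queue);
            rocblas_set_stream(handle, stream);
            // ... enqueue the rocBLAS call on this stream here ...
            hipStreamSynchronize(stream); // the synchronization that was missing
        });
    });
}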

@mmeterel mmeterel requested a review from mkrainiuk September 30, 2022 04:55
@mmeterel
Contributor

Attached are the results for my local tests using LLVM compiler:
rocblas_llvm_log.txt

I see three problems with these results:

  1. SCAL seg fault: Tracked in issue [rocBLAS - LLVM] seg fault in scal tests #233
  2. Warnings due to hipCtxGetCurrent: Tracked in issue [rocBLAS - LLVM] Deprecation warnings for hipCtxGetCurrent #234
  3. Warnings due to linking module: Tracked in [PI][HIP] warning: Linking two modules of different target triples: amdgcn-unknown-amdhsa and amdgcn-amd-amdhsa intel/llvm#6922

I also ran this PR using hipSYCL. Attached are the results:
rocblas_hipsycl_log.txt
The errors I see are intermittent and do not appear when we run gtest directly. I don't think these are related to this PR.

Based on the above discussion, I don't have concerns with merging this PR. The mentioned issues can be tracked separately.

Thanks for all your efforts on enabling rocBLAS backend with LLVM-HIP compiler.

@mmeterel
Contributor

@vmalia @mkrainiuk I just added my test results along with related issues. The PR is ok to merge IMHO. Could you please take a look from your side?

Contributor

@mkrainiuk mkrainiuk left a comment

Thank you for the PR! It looks good to me with one minor question.

  template <typename H, typename F>
  static inline void host_task_internal(H &cgh, sycl::queue queue, F f) {
-     cgh.interop_task([f, queue](sycl::interop_handler ih) {
+     cgh.host_task([f, queue](sycl::interop_handle ih) {
Contributor

Could it be affected by the changes in the DPC++ compiler that drop host support?

Contributor

@mkrainiuk In my tests with the latest LLVM, I did not see any deprecation warnings related to host_task(). I guess if/when the compiler drops support, we can switch to the recommended API.
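
For reference, a minimal sketch of what the host_task-based dispatch quoted above could look like if the native stream were taken from the interop_handle rather than from the captured queue; this assumes the SYCL 2020 interop_handle::get_native_queue API is implemented for the HIP backend in DPC++ and maps to a hipStream_t, which may not hold for every compiler version:

#include <CL/sycl.hpp>
#include <hip/hip_runtime.h>

// Illustrative variant of host_task_internal; F is expected to accept the
// native stream. The queue stays in the capture list as in the original helper.
template <typename H, typename F>
static inline void host_task_internal(H &cgh, sycl::queue queue, F f) {
    cgh.host_task([f, queue](sycl::interop_handle ih) {
        hipStream_t stream = ih.get_native_queue<sycl::backend::ext_oneapi_hip>();
        f(stream);
    });
}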
