[ROCM] Support TF_ROCM_CLANG for builds with clang host compiler #2192
retest Ubuntu-CPU please
We'll also need to update the symlink of rocm.bazelrc to point to gpu.bazelrc (instead of gpu_gcc.bazelrc)
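The symlink change described above could be sketched roughly as follows. The thread does not say which directory holds rocm.bazelrc, so the helper name `repoint_rocm_bazelrc` and its directory argument are assumptions for illustration.

```shell
# Repoint rocm.bazelrc at gpu.bazelrc instead of gpu_gcc.bazelrc.
# The directory holding these files is an assumption; pass it explicitly.
repoint_rocm_bazelrc() {
  local dir="$1"
  # -s: symbolic link, -f: replace an existing link,
  # -n: do not follow the existing link if it points at a directory
  ( cd "$dir" && ln -sfn gpu.bazelrc rocm.bazelrc )
}
```

Using a relative target keeps the link valid if the directory is copied into a container at a different path.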
Retest Ubuntu-GPU-single please
What is the status of this PR? Are we able to switch to clang now?
2ff01fa to c447595
Retest Ubuntu-CPU please.
065bd06 to 2a4f1f7
Retest Ubuntu-GPU-multi please.
6300e9a to cbdf070
Retest Ubuntu-CPU please.
@jayfurmanek I think this one is ready. I slightly dislike having to change the root .bazelrc in the last commit. An alternative is to create /userlocal/ in Dockerfile.rocm and point bazel in run_gpu_single/multi.sh at it with --config=/userlocal/gpu.bazelrc.
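One way the alternative could be wired into the CI scripts is sketched below. Bazel loads additional rc files through the `--bazelrc` startup option, so this sketch uses that flag; the helper name `bazel_startup_flags` and the silent fallback when the file is missing are assumptions, not something the PR defines.

```shell
# Emit the Bazel startup flags for the CI scripts.
# /userlocal/gpu.bazelrc is the path proposed in the comment; emitting nothing
# when the file is absent is an assumption made for illustration.
bazel_startup_flags() {
  local rc="${1:-/userlocal/gpu.bazelrc}"
  if [ -f "$rc" ]; then
    # --bazelrc is a startup option and must come before the command name.
    printf '%s\n' "--bazelrc=$rc"
  fi
}

# Hypothetical usage (not executed here):
#   bazel $(bazel_startup_flags) test --config=rocm <targets>
```

With this shape the root .bazelrc stays untouched, at the cost of the scripts depending on what the image provides.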
@i-chaochen Where can I update the docker image for ThirdParty-XLA ci?
ThirdParty-XLA CI is using the same tensorflow docker image (
It is missing clang-16, so the PR fails, and cannot be merged.
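A failure like this can be surfaced before the build starts with a small pre-flight check. This is only a sketch: the helper name `require_clang` is hypothetical, and the default path is the one that appears in this PR's bazelrc lines.

```shell
# Fail early when the host compiler expected by the rocm config is absent.
# The default path is the one used in this PR's bazelrc lines; the helper
# itself is hypothetical and not part of the PR.
require_clang() {
  local clang="${1:-/usr/lib/llvm-16/bin/clang}"
  if [ ! -x "$clang" ]; then
    echo "missing host compiler: $clang" >&2
    return 1
  fi
}
```

Calling `require_clang` at the top of a CI script turns a late toolchain error into an immediate, readable one.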
I think you need to add clang-16 into these three:
|
|
be01b2d to 98885a3
My bad. This CI is not blocking the merge. @jayfurmanek The only thing that is missing is the review. Should I include more people into reviewers?
build:rocm --action_env=CLANG_COMPILER_PATH="/usr/lib/llvm-16/bin/clang"
# Disable unused-result on rocm builds.
build:rocm --copt="-Wno-error=unused-result"
Since we're switching to use gpu.bazelrc, could you also add xla_cpp_filters from gpu_gcc.bazelrc ? Thanks!
@@ -15,6 +15,21 @@ build --action_env=CACHEBUSTER=565341047
# Build options for GPU Linux
build --config=release_gpu_linux
This line pulls in the CUDA config from the .bazelrc.
I think we'll need a "release_rocm_linux" config there.
build:rocm --define=tensorflow_mkldnn_contraction_kernel=0
build:rocm --repo_env TF_NEED_ROCM=1
build:rocm --action_env=TF_ROCM_CLANG="1"
build:rocm --repo_env=BAZEL_COMPILER="/usr/lib/llvm-16/bin/clang"
Can we update to clang-17 to match what's upstream now?
That will need to be changed in the container as well (merged separately)
Can we update to clang-17 to match what's upstream now?
Thanks. Missed that. Will update.
@@ -15,6 +15,21 @@ build --action_env=CACHEBUSTER=565341047
# Build options for GPU Linux
build --config=release_gpu_linux

# ROCM: Set up compilation ROCM version and paths
build:rocm --linkopt="-fuse-ld=gold"
I think we want to use lld for the linker
I think I had some issues getting lld on all docker images. I will have to check that.
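Whether lld is usable on a given image can be probed with a short check like the sketch below. The helper name `rocm_linkopt` is hypothetical. `ld.lld` is the executable clang looks for when given `-fuse-ld=lld`, but a plain PATH lookup is only an approximation, since clang also searches its own install directory.

```shell
# Pick a linker flag based on what the image provides.
# Falling back to gold mirrors the flag currently in this PR; the PATH lookup
# is an approximation and may miss an lld shipped next to clang itself.
rocm_linkopt() {
  if command -v ld.lld >/dev/null 2>&1; then
    echo '-fuse-ld=lld'
  else
    echo '-fuse-ld=gold'
  fi
}
```

Running this inside each docker image would show which ones still lack lld before the bazelrc flag is switched.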
c549c5f to 43db4b8
Remove TF_NEED_CLANG workarounds
The test docker image doesn't have /userlocal/gpu.bazelrc, so flip the switch in the root .bazelrc.
43db4b8 to b023dcd
retest gpu-pycpp please
retest gpu-non-pip-multi please
retest Ubuntu-GPU-multi please
b023dcd to db77d4a
Retest Ubuntu-GPU-single please.
Looks good!
Please tag the repo before and after merging:
pre-clang-merge
post-clang-merge
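The tagging request could be carried out along these lines. This is a sketch: the tag names come from the comment above, while the helper name `tag_ref`, the choice of lightweight tags, and the `origin` remote are assumptions.

```shell
# Create a lightweight tag on a ref and print the tag name.
# Tag names come from the review comment; the ref argument and the "origin"
# remote mentioned below are assumptions.
tag_ref() {
  local name="$1" ref="${2:-HEAD}"
  git tag "$name" "$ref" && echo "$name"
}

# Before merging: tag_ref pre-clang-merge <base-branch-tip>
# After merging:  tag_ref post-clang-merge <merge-commit>
# Then publish:   git push origin pre-clang-merge post-clang-merge
```

Having both tags makes it straightforward to bisect or diff across the compiler switch later.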
Remove TF_NEED_CLANG workarounds