Description
Issue type
Build/Install
Have you reproduced the bug with TensorFlow Nightly?
No
Source
source
TensorFlow version
2.15
Custom code
No
OS platform and distribution
Linux Ubuntu 22.04 LTS
Mobile device
No response
Python version
3.10
Bazel version
6.1.0
GCC/compiler version
11.4
CUDA/cuDNN version
No response
GPU model and memory
gfx1100 & gfx1036
Current behavior?
I am trying to compile from source, but the build fails because it also tries to compile for gfx1036.
I don't want that; I only want the gfx1100 version, but I am unable to disable the gfx1036 target.
I tried disabling the iGPU in the BIOS, but it still shows up in rocminfo and apparently also in the compilation process.
Is there a way to force the build not to compile for other gfx versions? I only want the gfx1100 version.
I also can't use the prebuilt binary, since it contains the "gfx1030gfx1100" bug string, which causes TF to ignore my GPU.
I need a way to disable the iGPU so that ROCm no longer sees it.
This issue is similar to #2292, but I can't find a way to skip this GPU.
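To make it easier to confirm which gfx targets are visible on a system, here is a minimal Python sketch that extracts the gfx names from rocminfo-style text. The function name `visible_gfx_targets` is hypothetical and not part of ROCm or TensorFlow, and the sketch assumes the tool prints ISA names in the form `gfx` followed by hexadecimal digits.

```python
import re

# Matches AMD GPU ISA names such as gfx1100 or gfx90a.
# Assumption: the tool output spells targets as "gfx" + hex digits.
_GFX_RE = re.compile(r"\bgfx[0-9a-f]+\b")


def visible_gfx_targets(rocminfo_text):
    """Return the unique gfx names found in the text, in first-seen order."""
    seen = []
    for name in _GFX_RE.findall(rocminfo_text):
        if name not in seen:
            seen.append(name)
    return seen
```

It could be fed the output of `subprocess.run(["rocminfo"], capture_output=True, text=True).stdout`. On the machine described here, the result would presumably still list gfx1036 alongside gfx1100, even with the iGPU disabled in the BIOS.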
Standalone code to reproduce the issue
Compile the latest version from source with a gfx1036 on the system.
Relevant log output
INFO: Found applicable config definition build:dynamic_kernels in file /home/user/custom_tf/tensorflow-upstream/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
WARNING: The following configs were expanded more than once: [rocm, rocm_base, no_tfrt, release_cpu_linux_base]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
INFO: Analyzed target //tensorflow/tools/pip_package:wheel (710 packages loaded, 50788 targets configured).
INFO: Found 1 target...
ERROR: /home/user/.cache/bazel/_bazel_user/8ff3c252cf6943b0e4c6e47a965a8647/external/local_xla/xla/service/gpu/BUILD:1412:23: Compiling xla/service/gpu/cub_sort_kernel.cu.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target @local_xla//xla/service/gpu:cub_sort_kernel_f64)
(cd /home/user/.cache/bazel/_bazel_user/8ff3c252cf6943b0e4c6e47a965a8647/execroot/org_tensorflow && \
exec env - \
CLANG_COMPILER_PATH=/usr/lib/llvm-17/bin/clang \
PATH=/home/user/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin \
PWD=/proc/self/cwd \
PYTHON_BIN_PATH=/usr/bin/python3 \
PYTHON_LIB_PATH=/usr/lib/python3/dist-packages \
ROCM_PATH=/opt/rocm-6.0.2 \
TF2_BEHAVIOR=1 \
TF_ROCM_CLANG=1 \
external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++14' -MD -MF bazel-out/k8-opt/bin/external/local_xla/xla/service/gpu/_objs/cub_sort_kernel_f64/cub_sort_kernel.cu.pic.d '-frandom-seed=bazel-out/k8-opt/bin/external/local_xla/xla/service/gpu/_objs/cub_sort_kernel_f64/cub_sort_kernel.cu.pic.o' -fPIC '-DEIGEN_MAX_ALIGN_BYTES=64' -DEIGEN_ALLOW_UNALIGNED_SCALARS '-DEIGEN_USE_AVX512_GEMM_KERNELS=0' '-DTENSORFLOW_USE_ROCM=1' -DCUB_TYPE_F64 '-DBAZEL_CURRENT_REPOSITORY="local_xla"' -iquote external/local_xla -iquote bazel-out/k8-opt/bin/external/local_xla -iquote external/eigen_archive -iquote bazel-out/k8-opt/bin/external/eigen_archive -iquote external/local_config_cuda -iquote bazel-out/k8-opt/bin/external/local_config_cuda -iquote external/local_tsl -iquote bazel-out/k8-opt/bin/external/local_tsl -iquote external/local_config_rocm -iquote bazel-out/k8-opt/bin/external/local_config_rocm -Ibazel-out/k8-opt/bin/external/local_config_cuda/cuda/_virtual_includes/cuda_headers_virtual -isystem external/eigen_archive -isystem bazel-out/k8-opt/bin/external/eigen_archive -isystem external/eigen_archive/mkl_include -isystem bazel-out/k8-opt/bin/external/eigen_archive/mkl_include -isystem external/local_config_cuda/cuda -isystem bazel-out/k8-opt/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/k8-opt/bin/external/local_config_cuda/cuda/cuda/include -isystem external/local_config_rocm/rocm -isystem bazel-out/k8-opt/bin/external/local_config_rocm/rocm -isystem external/local_config_rocm/rocm/rocm/include/hipcub -isystem bazel-out/k8-opt/bin/external/local_config_rocm/rocm/rocm/include/hipcub -isystem external/local_config_rocm/rocm/rocm/include/rocprim -isystem 
bazel-out/k8-opt/bin/external/local_config_rocm/rocm/rocm/include/rocprim -isystem external/local_config_rocm/rocm/rocm/include -isystem bazel-out/k8-opt/bin/external/local_config_rocm/rocm/rocm/include -isystem external/local_config_rocm/rocm/rocm/include/rocrand -isystem bazel-out/k8-opt/bin/external/local_config_rocm/rocm/rocm/include/rocrand -isystem external/local_config_rocm/rocm/rocm/include/roctracer -isystem bazel-out/k8-opt/bin/external/local_config_rocm/rocm/rocm/include/roctracer -Wno-all -Wno-extra -Wno-deprecated -Wno-deprecated-declarations -Wno-ignored-attributes -Wno-array-bounds -Wunused-result '-Werror=unused-result' -Wswitch '-Werror=switch' '-Wno-error=unused-but-set-variable' -DAUTOLOAD_DYNAMIC_KERNELS -Wno-gnu-offsetof-extensions -Wno-unused-result -Wno-sign-compare -Wno-gnu-offsetof-extensions -Wno-unused-result '-std=c++17' -x rocm '--amdgpu-target=gfx1100' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' '-DTENSORFLOW_USE_ROCM=1' -D__HIP_PLATFORM_AMD__ -DEIGEN_USE_HIP -no-canonical-prefixes -fno-canonical-system-headers -c external/local_xla/xla/service/gpu/cub_sort_kernel.cu.cc -o bazel-out/k8-opt/bin/external/local_xla/xla/service/gpu/_objs/cub_sort_kernel_f64/cub_sort_kernel.cu.pic.o)
# Configuration: e4ece56677a12dcf02a4cc8466fa0e1a29e7ca5c7dc9c8d9b2f8ab0324debfef
# Execution platform: @local_execution_config_platform//:platform
clang: warning: argument unused during compilation: '-fgpu-flush-denormals-to-zero' [-Wunused-command-line-argument]
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr43 = V_MOV_B32_dpp undef $vgpr43(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr4 = V_MOV_B32_dpp undef $vgpr4(tied-def 0), killed $vgpr3, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr3 = V_MOV_B32_dpp undef $vgpr3(tied-def 0), $vgpr2, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr48 = V_MOV_B32_dpp undef $vgpr48(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr45 = V_MOV_B32_dpp undef $vgpr45(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr48 = V_MOV_B32_dpp undef $vgpr48(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr45 = V_MOV_B32_dpp undef $vgpr45(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr43 = V_MOV_B32_dpp undef $vgpr43(tied-def 0), $vgpr4, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr48 = V_MOV_B32_dpp undef $vgpr48(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr45 = V_MOV_B32_dpp undef $vgpr45(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr48 = V_MOV_B32_dpp undef $vgpr48(tied-def 0), $vgpr45, 322, 15, 15, 0, implicit $exec
error: Illegal instruction detected: Invalid dpp_ctrl value: broadcasts are not supported on GFX10+
renamable $vgpr45 = V_MOV_B32_dpp undef $vgpr45(tied-def 0), $vgpr44, 322, 15, 15, 0, implicit $exec
12 errors generated when compiling for gfx1036.
Target //tensorflow/tools/pip_package:wheel failed to build
INFO: Elapsed time: 177.097s, Critical Path: 81.30s
INFO: 6265 processes: 1446 internal, 4819 local.
FAILED: Build did NOT complete successfully