This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

SIGSEGV in Device::setupCpuAgent #17

Open
samcv opened this issue Aug 25, 2020 · 3 comments

samcv commented Aug 25, 2020

Running clinfo (either the system version or the ROCm version) ends in a segmentation fault. This is on tag rocm-3.7.0, and my system is openSUSE Tumbleweed. The crash happens both when I install from the official ROCm SUSE repository and when I compile rocclr and rocm-opencl-runtime myself (with all other packages taken from the official repository). Compiling those two with debug symbols is how I got the backtrace below.

The hardware is an AMD Ryzen 7 PRO 2700U APU, using its integrated Vega graphics.

Thread 1 "clinfo" received signal SIGSEGV, Segmentation fault.
0x00007ffff7923266 in roc::Device::setupCpuAgent (this=0x4f3870) at /home/samantha/git/ROCm/ROCclr/device/rocm/rocdevice.cpp:170
170       cpu_agent_ = cpu_agents_[index].agent;
(gdb) bt
#0  0x00007ffff7923266 in roc::Device::setupCpuAgent (this=0x4f3870) at /home/samantha/git/ROCm/ROCclr/device/rocm/rocdevice.cpp:170
#1  0x00007ffff7926840 in roc::Device::populateOCLDeviceConstants (this=0x4f3870) at /home/samantha/git/ROCm/ROCclr/device/rocm/rocdevice.cpp:1047
#2  0x00007ffff7924d08 in roc::Device::create (this=0x4f3870) at /home/samantha/git/ROCm/ROCclr/device/rocm/rocdevice.cpp:593
#3  0x00007ffff79245a1 in roc::Device::init () at /home/samantha/git/ROCm/ROCclr/device/rocm/rocdevice.cpp:489
#4  0x00007ffff78b7390 in amd::Device::init () at /home/samantha/git/ROCm/ROCclr/device/device.cpp:194
#5  0x00007ffff78d2a53 in amd::Runtime::init () at /home/samantha/git/ROCm/ROCclr/platform/runtime.cpp:74
#6  0x00007ffff78a318b in ShouldLoadPlatform () at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/amdocl/cl_icd.cpp:222
#7  0x00007ffff78a3287 in operator() (__closure=0x7fffffffcdaf) at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/amdocl/cl_icd.cpp:272
#8  0x00007ffff78a366b in std::__invoke_impl<void, clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...) at /usr/include/c++/10/bits/invoke.h:60
#9  0x00007ffff78a363a in std::__invoke<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/10/bits/invoke.h:95
#10 0x00007ffff78a3531 in operator() (this=0x7fffffffcd50) at /usr/include/c++/10/mutex:717
#11 0x00007ffff78a355b in operator() (this=0x0) at /usr/include/c++/10/mutex:722
#12 0x00007ffff78a356c in _FUN () at /usr/include/c++/10/mutex:722
#13 0x00007ffff7c2e36f in __pthread_once_slow () from /lib64/libpthread.so.0
#14 0x00007ffff78a3084 in __gthread_once (__once=0x7ffff7a4b050 <clIcdGetPlatformIDsKHR::initOnce>, __func=0x7ffff7e754e0 <std::__once_proxy()>) at /usr/include/c++/10/x86_64-suse-linux/bits/gthr-default.h:700
#15 0x00007ffff78a35f2 in std::call_once<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> >(std::once_flag &, struct {...} &&) (__once=..., __f=...) at /usr/include/c++/10/mutex:729
#16 0x00007ffff78a32e1 in clIcdGetPlatformIDsKHR (num_entries=0, platforms=0x0, num_platforms=0x7fffffffcdec) at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/amdocl/cl_icd.cpp:272
#17 0x00007ffff7fc236d in khrIcdVendorAdd (libraryName=0x442110 "libamdocl64.so") at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/khronos/icd/loader/icd.c:87
#18 0x00007ffff7fc5d3e in khrIcdOsVendorsEnumerate () at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/khronos/icd/loader/linux/icd_linux.c:125
#19 0x00007ffff7c2e36f in __pthread_once_slow () from /lib64/libpthread.so.0
#20 0x00007ffff7fc5dcc in khrIcdOsVendorsEnumerateOnce () at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/khronos/icd/loader/linux/icd_linux.c:149
#21 0x00007ffff7fc2272 in khrIcdInitialize () at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/khronos/icd/loader/icd.c:31
#22 0x00007ffff7fc27c3 in clGetPlatformIDs (num_entries=0, platforms=0x0, num_platforms=0x7fffffffcfa0) at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/khronos/icd/loader/icd_dispatch.c:34
#23 0x0000000000408216 in cl::Platform::get (platforms=0x7fffffffd370) at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/khronos/headers/opencl2.2/CL/cl2.hpp:2482
#24 0x0000000000403813 in main (argc=1, argv=0x7fffffffda98) at /home/samantha/git/ROCm/ROCm-OpenCL-Runtime/tools/clinfo/clinfo.cpp:75
Contributor

vsytch commented Sep 8, 2020

This has been fixed internally and the issue should be gone with the next ROCm release.

Author

samcv commented Sep 8, 2020

@vsytch good news. Is there a patch you can provide to fix this issue?

faust3 commented Oct 3, 2020

@samcv

I had the same problem with a Ryzen 2400G and Ubuntu 20.04.

I don't know what the internal fix looks like, but the following change prevented the crash for me.

rocmPatch.patch.txt
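
The contents of the attached patch are not reproduced in this thread. Purely as an illustration of the kind of guard that avoids a crash on a line like `cpu_agent_ = cpu_agents_[index].agent;`, here is a minimal sketch. It assumes the fault comes from `index` being out of range for the agent list, which the backtrace does not confirm. The types `Agent` and `CpuAgentInfo` and the function `pickCpuAgent` are hypothetical stand-ins, not ROCclr code, and this is not the internal fix.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-ins for the real ROCclr types; the names are illustrative only.
struct Agent {
  int handle;
};

struct CpuAgentInfo {
  Agent agent;
};

// Returns the agent stored at `index`, or std::nullopt when the list is empty
// or `index` is out of range, instead of indexing past the end of the vector
// (which is undefined behavior and can show up as a SIGSEGV).
std::optional<Agent> pickCpuAgent(const std::vector<CpuAgentInfo>& cpuAgents,
                                  std::size_t index) {
  if (index >= cpuAgents.size()) {
    return std::nullopt;
  }
  return cpuAgents[index].agent;
}
```

A caller would check the returned optional and fail device initialization cleanly when no CPU agent is available, rather than dereferencing an invalid element.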
