Examples segfault (Linux / mesa / amdgpu) #84

Closed
Kezii opened this issue Nov 1, 2020 · 5 comments
@Kezii
Kezii commented Nov 1, 2020

I'm still trying to run this library

I am not able to run any of the examples; they all fail with a SEGV.
My GPU:

Devices:
========
GPU0:
        apiVersion         = 4202627 (1.2.131)
        driverVersion      = 83894273 (0x5002001)
        vendorID           = 0x1002
        deviceID           = 0x67df
        deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName         = AMD RADV POLARIS10 (ACO)
        driverID           = DRIVER_ID_MESA_RADV
        driverName         = radv
        driverInfo         = Mesa 20.2.1 (ACO)
        conformanceVersion = 1.2.3.0
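
For reference, the packed `apiVersion` and `driverVersion` integers above can be decoded by hand. This is a minimal sketch, assuming the standard Vulkan 10/10/12-bit layout used by `VK_VERSION_MAJOR`, `VK_VERSION_MINOR` and `VK_VERSION_PATCH`. The encoding of `driverVersion` is vendor-defined, so applying the same layout to it is an assumption; here it happens to agree with the `Mesa 20.2.1` string. `Version` and `decodeVersion` are illustrative names, not part of Vulkan or Kompute.

```cpp
#include <cstdint>

// Illustrative holder for a decoded version triple (hypothetical name).
struct Version {
    std::uint32_t major;
    std::uint32_t minor;
    std::uint32_t patch;
};

// Splits a packed version into major (bits 22 and up), minor (bits 12-21)
// and patch (bits 0-11), mirroring the classic VK_VERSION_* macros.
constexpr Version decodeVersion(std::uint32_t packed) {
    return Version{ packed >> 22, (packed >> 12) & 0x3FFu, packed & 0xFFFu };
}

// decodeVersion(4202627u)  -> 1.2.131  (apiVersion above)
// decodeVersion(83894273u) -> 20.2.1   (driverVersion above, assuming the same layout)
```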

Gdb points here: https://github.com/EthicalML/vulkan-kompute/blob/7b9f84f0abf1156fd2806957012093b19e126e22/single_include/kompute/Kompute.hpp#L1845

Apparently it works with an NVIDIA GPU, using the same AUR package I made before.

Steps to reproduce:

cd /tmp
git clone https://github.com/EthicalML/vulkan-kompute.git
cd vulkan-kompute/examples/array_multiplication/
cmake -Bbuild
make -C build -j12
build/kompute_array_mult

log: (I may have configured the logging library wrong?)

> ./kompute_array_mult
DEBUG: Kompute Manager creating instance
DEBUG: Kompute Manager Instance Created
DEBUG: Kompute Manager creating Device
INFO: Using physical device index {} found {}
DEBUG: Kompute Manager device created
DEBUG: Kompute Manager compute queue obtained
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1488] Kompute Manager createInitTensor triggered
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1490] Kompute Manager creating new tensor shared ptr
DEBUG: Kompute Tensor destructor started. Type: {}
DEBUG: Kompute Tensor destructor success
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1365] Kompute Manager evalOp Default triggered
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1334] Kompute Manager evalOp triggered
DEBUG: Kompute Manager creating Sequence object
DEBUG: Kompute Manager createManagedSequence with sequenceName: {} and queueIndex: {}
DEBUG: Kompute Sequence Constructor with existing device & queue
DEBUG: Kompute Sequence creating command pool
DEBUG: Kompute Sequence Command Pool Created
DEBUG: Kompute Sequence creating command buffer
DEBUG: Kompute Sequence Command Buffer Created
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1339] Kompute Manager evalOp running sequence BEGIN
DEBUG: Kompute sequence called BEGIN
INFO: Kompute Sequence command recording BEGIN
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1342] Kompute Manager evalOp running sequence RECORD
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1120] Kompute Sequence record function started
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1128] Kompute Sequence creating OpBase derived class instance
DEBUG: Compute OpBase constructor with params
DEBUG: Kompute OpTensorCreate constructor with params
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1139] Kompute Sequence running init on OpBase derived class instance
DEBUG: Kompute OpTensorCreate init called
DEBUG: Kompute Tensor running init with Vulkan params and num data elementS: {}
DEBUG: Kompute Tensor creating buffer
DEBUG: Kompute Tensor creating buffer with memory size: {}, and usage flags: {}
DEBUG: Kompute Tensor buffer created now creating memory
DEBUG: Kompute Tensor allocating memory index: {}, size {}, flags: {}
DEBUG: Kompute Tensor buffer & memory creation successful
DEBUG: Kompute Tensor running init with Vulkan params and num data elementS: {}
DEBUG: Kompute Tensor creating buffer
DEBUG: Kompute Tensor creating buffer with memory size: {}, and usage flags: {}
DEBUG: Kompute Tensor buffer created now creating memory
DEBUG: Kompute Tensor allocating memory index: {}, size {}, flags: {}
DEBUG: Kompute Tensor buffer & memory creation successful
DEBUG: Kompute Tensor local mapping tensor data to host buffer
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1143] Kompute Sequence running record on OpBase derived class instance
DEBUG: Kompute OpTensorCreate record called
DEBUG: Kompute Tensor recordCopyFrom called
DEBUG: Kompute Tensor copying data size {}.
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1345] Kompute Manager evalOp running sequence END
DEBUG: Kompute Sequence calling END
INFO: Kompute Sequence command recording END
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1348] Kompute Manager evalOp running sequence EVAL
DEBUG: Kompute sequence EVAL BEGIN
DEBUG: Kompute OpTensorCreate preEval called
DEBUG: Kompute sequence submitting command buffer into compute queue
DEBUG: Kompute OpTensorCreate postEval called
DEBUG: Kompute sequence EVAL SUCCESS
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1351] Kompute Manager evalOp running sequence SUCCESS
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1488] Kompute Manager createInitTensor triggered
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1490] Kompute Manager creating new tensor shared ptr
DEBUG: Kompute Tensor destructor started. Type: {}
DEBUG: Kompute Tensor destructor success
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1365] Kompute Manager evalOp Default triggered
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1334] Kompute Manager evalOp triggered
DEBUG: Kompute Manager creating Sequence object
DEBUG: Kompute Manager createManagedSequence with sequenceName: {} and queueIndex: {}
DEBUG: Kompute Sequence Constructor with existing device & queue
DEBUG: Kompute Sequence creating command pool
DEBUG: Kompute Sequence Command Pool Created
DEBUG: Kompute Sequence creating command buffer
DEBUG: Kompute Sequence Command Buffer Created
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1339] Kompute Manager evalOp running sequence BEGIN
DEBUG: Kompute sequence called BEGIN
INFO: Kompute Sequence command recording BEGIN
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1342] Kompute Manager evalOp running sequence RECORD
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1120] Kompute Sequence record function started
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1128] Kompute Sequence creating OpBase derived class instance
DEBUG: Compute OpBase constructor with params
DEBUG: Kompute OpTensorCreate constructor with params
[2020-11-01 21:19:14.167] [debug] [Kompute.hpp:1139] Kompute Sequence running init on OpBase derived class instance
DEBUG: Kompute OpTensorCreate init called
DEBUG: Kompute Tensor running init with Vulkan params and num data elementS: {}
DEBUG: Kompute Tensor creating buffer
DEBUG: Kompute Tensor creating buffer with memory size: {}, and usage flags: {}
DEBUG: Kompute Tensor buffer created now creating memory
DEBUG: Kompute Tensor allocating memory index: {}, size {}, flags: {}
DEBUG: Kompute Tensor buffer & memory creation successful
DEBUG: Kompute Tensor running init with Vulkan params and num data elementS: {}
DEBUG: Kompute Tensor creating buffer
DEBUG: Kompute Tensor creating buffer with memory size: {}, and usage flags: {}
DEBUG: Kompute Tensor buffer created now creating memory
DEBUG: Kompute Tensor allocating memory index: {}, size {}, flags: {}
DEBUG: Kompute Tensor buffer & memory creation successful
DEBUG: Kompute Tensor local mapping tensor data to host buffer
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1143] Kompute Sequence running record on OpBase derived class instance
DEBUG: Kompute OpTensorCreate record called
DEBUG: Kompute Tensor recordCopyFrom called
DEBUG: Kompute Tensor copying data size {}.
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1345] Kompute Manager evalOp running sequence END
DEBUG: Kompute Sequence calling END
INFO: Kompute Sequence command recording END
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1348] Kompute Manager evalOp running sequence EVAL
DEBUG: Kompute sequence EVAL BEGIN
DEBUG: Kompute OpTensorCreate preEval called
DEBUG: Kompute sequence submitting command buffer into compute queue
DEBUG: Kompute OpTensorCreate postEval called
DEBUG: Kompute sequence EVAL SUCCESS
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1351] Kompute Manager evalOp running sequence SUCCESS
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1488] Kompute Manager createInitTensor triggered
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1490] Kompute Manager creating new tensor shared ptr
DEBUG: Kompute Tensor destructor started. Type: {}
DEBUG: Kompute Tensor destructor success
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1365] Kompute Manager evalOp Default triggered
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1334] Kompute Manager evalOp triggered
DEBUG: Kompute Manager creating Sequence object
DEBUG: Kompute Manager createManagedSequence with sequenceName: {} and queueIndex: {}
DEBUG: Kompute Sequence Constructor with existing device & queue
DEBUG: Kompute Sequence creating command pool
DEBUG: Kompute Sequence Command Pool Created
DEBUG: Kompute Sequence creating command buffer
DEBUG: Kompute Sequence Command Buffer Created
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1339] Kompute Manager evalOp running sequence BEGIN
DEBUG: Kompute sequence called BEGIN
INFO: Kompute Sequence command recording BEGIN
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1342] Kompute Manager evalOp running sequence RECORD
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1120] Kompute Sequence record function started
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1128] Kompute Sequence creating OpBase derived class instance
DEBUG: Compute OpBase constructor with params
DEBUG: Kompute OpTensorCreate constructor with params
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1139] Kompute Sequence running init on OpBase derived class instance
DEBUG: Kompute OpTensorCreate init called
DEBUG: Kompute Tensor running init with Vulkan params and num data elementS: {}
DEBUG: Kompute Tensor creating buffer
DEBUG: Kompute Tensor creating buffer with memory size: {}, and usage flags: {}
DEBUG: Kompute Tensor buffer created now creating memory
DEBUG: Kompute Tensor allocating memory index: {}, size {}, flags: {}
DEBUG: Kompute Tensor buffer & memory creation successful
DEBUG: Kompute Tensor running init with Vulkan params and num data elementS: {}
DEBUG: Kompute Tensor creating buffer
DEBUG: Kompute Tensor creating buffer with memory size: {}, and usage flags: {}
DEBUG: Kompute Tensor buffer created now creating memory
DEBUG: Kompute Tensor allocating memory index: {}, size {}, flags: {}
DEBUG: Kompute Tensor buffer & memory creation successful
DEBUG: Kompute Tensor local mapping tensor data to host buffer
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1143] Kompute Sequence running record on OpBase derived class instance
DEBUG: Kompute OpTensorCreate record called
DEBUG: Kompute Tensor recordCopyFrom called
DEBUG: Kompute Tensor copying data size {}.
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1345] Kompute Manager evalOp running sequence END
DEBUG: Kompute Sequence calling END
INFO: Kompute Sequence command recording END
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1348] Kompute Manager evalOp running sequence EVAL
DEBUG: Kompute sequence EVAL BEGIN
DEBUG: Kompute OpTensorCreate preEval called
DEBUG: Kompute sequence submitting command buffer into compute queue
DEBUG: Kompute OpTensorCreate postEval called
DEBUG: Kompute sequence EVAL SUCCESS
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1351] Kompute Manager evalOp running sequence SUCCESS
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1365] Kompute Manager evalOp Default triggered
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1334] Kompute Manager evalOp triggered
DEBUG: Kompute Manager creating Sequence object
DEBUG: Kompute Manager createManagedSequence with sequenceName: {} and queueIndex: {}
DEBUG: Kompute Sequence Constructor with existing device & queue
DEBUG: Kompute Sequence creating command pool
DEBUG: Kompute Sequence Command Pool Created
DEBUG: Kompute Sequence creating command buffer
DEBUG: Kompute Sequence Command Buffer Created
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1339] Kompute Manager evalOp running sequence BEGIN
DEBUG: Kompute sequence called BEGIN
INFO: Kompute Sequence command recording BEGIN
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1342] Kompute Manager evalOp running sequence RECORD
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1120] Kompute Sequence record function started
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1128] Kompute Sequence creating OpBase derived class instance
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:913] Compute OpBase constructor with params
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1767] Kompute OpAlgoBase constructor with params numTensors: 3
[2020-11-01 21:19:14.168] [info] [Kompute.hpp:1782] Kompute OpAlgoBase dispatch size X: 3, Y: 1, Z: 1
DEBUG: Kompute Algorithm Constructor with device
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1811] Kompute OpAlgoBase shaderFilePath constructo with shader raw data length: 479
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1139] Kompute Sequence running init on OpBase derived class instance
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1826] Kompute OpAlgoBase init called
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1839] Kompute OpAlgoBase fetching spirv data
[2020-11-01 21:19:14.168] [warning] [Kompute.hpp:1884] Kompute OpAlgoBase Running shaders directly from spirv file
[2020-11-01 21:19:14.168] [debug] [Kompute.hpp:1843] Kompute OpAlgoBase Initialising algorithm component
DEBUG: Kompute Algorithm init started
DEBUG: Kompute Algorithm createParameters started
DEBUG: Kompute Algorithm creating descriptor pool
DEBUG: Kompute Algorithm creating descriptor set layout
DEBUG: Kompute Algorithm allocating descriptor sets
DEBUG: Kompute Algorithm updating descriptor sets
DEBUG: Kompue Algorithm successfully run init
DEBUG: Kompute Algorithm createShaderModule started
DEBUG: Kompute Algorithm Creating shader module. ShaderFileSize: {}
DEBUG: Kompute Algorithm create shader module success
DEBUG: Kompute Algorithm calling create Pipeline
AddressSanitizer:DEADLYSIGNAL
=================================================================
==243214==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000000c8 (pc 0x7f7a9e84b1c5 bp 0x60700000c1f0 sp 0x7fff80d84550 T0)
==243214==The signal is caused by a READ memory access.
==243214==Hint: address points to the zero page.
    #0 0x7f7a9e84b1c5  (/usr/lib/libvulkan_radeon.so+0x3741c5)
    #1 0x7f7a9e6081af  (/usr/lib/libvulkan_radeon.so+0x1311af)
    #2 0x7f7a9e5f561c  (/usr/lib/libvulkan_radeon.so+0x11e61c)
    #3 0x7f7a9e5fe184  (/usr/lib/libvulkan_radeon.so+0x127184)
    #4 0x7f7a9e5fe593  (/usr/lib/libvulkan_radeon.so+0x127593)
    #5 0x557d878de759 in kp::Algorithm::createPipeline(std::vector<unsigned int, std::allocator<unsigned int> >) (/home/kezi/Progetti/vulkan-kompute/examples/build-array_multiplication-Desktop-Debug/kompute_array_mult+0x37759)
    #6 0x557d878df6ab in kp::Algorithm::init(std::vector<char, std::allocator<char> > const&, std::vector<std::shared_ptr<kp::Tensor>, std::allocator<std::shared_ptr<kp::Tensor> > >) (/home/kezi/Progetti/vulkan-kompute/examples/build-array_multiplication-Desktop-Debug/kompute_array_mult+0x386ab)
    #7 0x557d878c0c03 in kp::OpAlgoBase<0u, 0u, 0u>::init() /usr/local/include/kompute/Kompute.hpp:1845
    #8 0x557d878d6b80 in bool kp::Sequence::record<kp::OpAlgoBase<0u, 0u, 0u>, std::vector<char, std::allocator<char> > >(std::vector<std::shared_ptr<kp::Tensor>, std::allocator<std::shared_ptr<kp::Tensor> > >, std::vector<char, std::allocator<char> >&&) /usr/local/include/kompute/Kompute.hpp:1141
    #9 0x557d878d88f6 in void kp::Manager::evalOp<kp::OpAlgoBase<0u, 0u, 0u>, std::vector<char, std::allocator<char> > >(std::vector<std::shared_ptr<kp::Tensor>, std::allocator<std::shared_ptr<kp::Tensor> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<char, std::allocator<char> >&&) /usr/local/include/kompute/Kompute.hpp:1343
    #10 0x557d878da889 in void kp::Manager::evalOpDefault<kp::OpAlgoBase<0u, 0u, 0u>, std::vector<char, std::allocator<char> > >(std::vector<std::shared_ptr<kp::Tensor>, std::allocator<std::shared_ptr<kp::Tensor> > >, std::vector<char, std::allocator<char> >&&) /usr/local/include/kompute/Kompute.hpp:1367
    #11 0x557d878b6f4f in main /home/kezi/Progetti/vulkan-kompute/examples/array_multiplication/src/Main.cpp:40
    #12 0x7f7aa22d0151 in __libc_start_main (/usr/lib/libc.so.6+0x28151)
    #13 0x557d878b626d in _start (/home/kezi/Progetti/vulkan-kompute/examples/build-array_multiplication-Desktop-Debug/kompute_array_mult+0xf26d)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib/libvulkan_radeon.so+0x3741c5) 
==243214==ABORTING
@axsaucedo
Member

Thank you for the update @Kezii - just to make sure I understand your issue, you are saying that you have been able to confirm that it works with NVIDIA graphics card but not with AMD?

Looking at the logs for the example, it seems to fail when calling createPipeline. This could be debugged in various ways. As a first step, it would be good to understand whether the error comes from the format of the shader. Could you try the example using the SPIR-V bytes instead of the string?

@axsaucedo
Member

axsaucedo commented Nov 1, 2020

More specifically, this would mean reading the shader from the shaderopmult.hpp header file, generated with the Kompute shader-to-cpp-header script, similar to how the Android example provides the option to load the shader:
https://github.com/EthicalML/vulkan-kompute/blob/9babbc54ee96af1495b59c50f2d9116e75e51f4c/examples/android/android-simple/app/src/main/cpp/KomputeModelML.cpp#L67-L72
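
To illustrate what "SPIR-V bytes" means here, the following is a minimal sketch that turns a byte array embedded in a header into the `std::vector<char>` that the stack trace above shows `kp::Manager::evalOpDefault` receiving. The array `kShaderSpv`, its length constant and the helper `spirvBytesFromArray` are hypothetical stand-ins; the real generated header and its symbol names may differ, and the 8 bytes below are only a SPIR-V magic number and version word, not a complete shader.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a generated header: the first two words of a
// SPIR-V module (magic number 0x07230203 and version 1.0), little-endian.
static const unsigned char kShaderSpv[] = {
    0x03, 0x02, 0x23, 0x07,  // magic number
    0x00, 0x00, 0x01, 0x00   // version word
};
static const std::size_t kShaderSpvLen = sizeof(kShaderSpv);

// Copies an embedded byte array into an owning std::vector<char>.
// (Hypothetical helper, not part of Kompute.)
std::vector<char> spirvBytesFromArray(const unsigned char* data, std::size_t len) {
    const char* begin = reinterpret_cast<const char*>(data);
    return std::vector<char>(begin, begin + len);
}
```

The resulting vector holds compiled SPIR-V rather than GLSL text, so no shader compiler is needed at run time.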

@Kezii
Author

Kezii commented Nov 1, 2020

you are saying that you have been able to confirm that it works with NVIDIA graphics card but not with AMD?

Yes, I asked a friend with the same OS and an NVIDIA card to try it, and it worked for him. I'm not really sure about the conditions, since I do not have control over his PC (though he used the same AUR package).
I also tested on another PC with an Intel iGPU, and it segfaults there too.

Could you try the example using the SPIR-V bytes instead of the string?

I tried the code from your last comment and it worked!
I'll take some time to figure out what happened, since, as you can imagine, this is the first time I got the library to work

Thanks for the quick reply

@axsaucedo
Member

Awesome! Thank you for confirming @Kezii - I have noticed that only a few devices support provisioning of shaders in their raw string format. I suspect this has less to do with the graphics card itself: compiling shaders from a raw string into SPIR-V bytes would normally be done by a library like shaderc, so it is more likely that relevant dependencies such as shaderc are not available.

I will open an issue in the Vulkan forums to find out more about this, and will track it via #85

Thank you very much for reporting this, though. Regarding your points about the example, I will update it to use the same macro guard so that it loads the shader from the SPIR-V format by default, which makes it more robust.

@Kezii
Author

Kezii commented Nov 2, 2020

have noticed that only a few devices support provisioning of shaders in their raw string format.

Yeah, I'm surprised that this worked; I was assuming that it was compiling the shaders under the hood (like Rust's vulkano library).

Indeed, if I point the shader to the .comp file it segfaults, but if I compile the shader and put the path of the .v file it works perfectly.
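
The difference between the two inputs can be checked before any bytes reach the driver, since `vkCreateShaderModule` expects SPIR-V words rather than GLSL text. Below is a minimal sketch of such a guard, assuming a little-endian host and that the shader is loaded from a file into a `std::vector<char>`. `readBinaryFile` and `looksLikeSpirv` are hypothetical helper names and not part of Kompute.

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Reads a whole file as raw bytes; throws if the file cannot be opened.
std::vector<char> readBinaryFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Returns true if the buffer is a non-empty whole number of 32-bit words
// and starts with the SPIR-V magic number 0x07230203 in host byte order.
// This is only a sanity check, not full SPIR-V validation.
bool looksLikeSpirv(const std::vector<char>& bytes) {
    const std::uint32_t kSpirvMagic = 0x07230203u;
    if (bytes.size() < sizeof(std::uint32_t) ||
        bytes.size() % sizeof(std::uint32_t) != 0) {
        return false;
    }
    std::uint32_t firstWord = 0;
    std::memcpy(&firstWord, bytes.data(), sizeof(firstWord));
    return firstWord == kSpirvMagic;
}
```

A caller could run `looksLikeSpirv(readBinaryFile(path))` on the shader path and raise a readable error when it returns false, instead of letting the driver crash inside pipeline creation. GLSL source such as a .comp file starts with text like `#version`, so it would fail this check.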

Thanks for addressing this issue
