ViennaCL fatal error in GPU mode #6259
Comments
So, the most likely reason for it to fail is that you cannot switch a network from CPU to GPU with OpenCL Caffe; the layers need to know which device they will use beforehand. You need to change the example to:
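A minimal sketch of that change for the 00-classification example, assuming the usual pycaffe calls (the model paths and device index below are placeholders, not taken from the issue):

```python
# Hypothetical sketch: select the device and mode *before* constructing the net,
# so the layers are instantiated on the GPU rather than the CPU.
import caffe

caffe.set_device(0)   # pick the OpenCL device first
caffe.set_mode_gpu()  # switch to GPU mode before any net exists

# Only now build the net, so its layers are created on the chosen device.
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
```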
This is due to how OpenCL devices are managed in an OOP way, which is not required by CUDA.
You can also try: [...] If you look at your clinfo output: since the latest ROCm and amdgpu-pro, AMD seems to report the work item sizes wrong. They should be 256x256x256 (as is also evident from the max work group size parameter). What does this mean? Kernels that rely on auto-selecting the workgroup sizes fail with the current version of ViennaCL until we use the min of "Max work group size" and "Max work item sizes", or until AMD fixes the bug.
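Not from the thread, but as a quick way to see what the driver actually reports, here is a sketch of the clamping idea described above using pyopencl (pyopencl is an assumption here; Caffe/ViennaCL query the same values through the OpenCL C API):

```python
# Sketch: inspect the values the driver reports and apply the proposed workaround,
# i.e. clamp each "Max work item size" by the "Max work group size".
import pyopencl as cl

for platform in cl.get_platforms():
    for dev in platform.get_devices():
        max_wg = dev.max_work_group_size        # "Max work group size", e.g. 256
        max_wi = dev.max_work_item_sizes        # "Max work item sizes", possibly misreported
        clamped = [min(max_wg, s) for s in max_wi]
        print(dev.name, max_wg, list(max_wi), clamped)
```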
So, I used the same driver as you and ran it on Polaris and Vega. The second observation I made is that ViennaCL's GEMM no longer seems to be compatible with AMD's latest OpenCL driver (you will get a memory access violation error from AMD's OpenCL; it works on Intel and nVidia though). Therefore it is absolutely necessary to use CLBlast or clBLAS (and recompile Caffe with it).
Thanks for the prompt response and for looking into this! Moving the calls to [...]. I'll take the other steps you recommended too. Thanks again.
@jnschaeffer No, unfortunately it's a bit underdocumented as of now. The reason is that I'm mainly spending time preparing the next big release with a lot of additional features (quantized data types, faster network inference, device-abstracted backend) before I add full documentation. Please report back if you can successfully run the network.
Just checked; the network runs successfully with Arch's [...]
Issue summary
Trying to run models in caffe using GPU mode with `amdgpu` on Arch results in a ViennaCL kernel start error and subsequent crash. CPU mode has no issues. This doesn't seem to be the case for all models, but even when the GPU is used it is several times slower than the CPU; this may or may not be related.
It seems like this issue may be related to #5804 and #6258. All caffe tests build and run successfully, however. Additionally, the machine this was tested on is using the `opencl-amd` package, which seems to install the AMDGPU-PRO OpenCL libraries as described in #5804.
Output from Python:
Steps to reproduce
With an AMD card on the `opencl` branch of caffe, run through the entire IPython notebook in examples/00-classification.ipynb. Alternatively, this gist is a modified version of (part of) the same example with GPU mode swapped in for CPU mode.

Your system configuration
Operating system: Arch Linux x86_64 4.15.5-1-ARCH
Compiler: g++ (GCC) 7.3.0
BLAS: OpenBLAS
Python or MATLAB version (for pycaffe and matcaffe respectively): Python
`clinfo` output: here
`caffe device_query` output: here