CL_OUT_OF_RESOURCES on GC2000+ #73

Open
tequilaguru opened this issue Jun 24, 2018 · 0 comments

Issue summary

Hi,

I'm trying to get Caffe to work on an i.MX6QP+, which has a GC2000+ GPU (Full Profile). I modified the source code in ocl_device_program.cpp to remove the "vec_type_hint" attribute, because the driver seems to fail compilation when it is present in the kernel:

  case KERNEL_HINT_VEC_TYPE:
    /*ss << "__attribute__((vec_type_hint(" << std::get<1>(hints[i])
       << ")))" << std::endl;*/
    break;
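
Commenting the line out disables the hint for every device. A less invasive variant might skip it only for the affected vendor. The following is a minimal sketch under that assumption: `vendor_accepts_vec_type_hint` and `vec_type_hint_attribute` are hypothetical helpers, not part of Caffe, and the vendor string matched is the one clinfo reports for this platform.

```cpp
#include <sstream>
#include <string>

// Hypothetical check: assume only the Vivante compiler rejects the attribute.
bool vendor_accepts_vec_type_hint(const std::string& vendor) {
  return vendor.find("Vivante") == std::string::npos;
}

// Returns the attribute line for the kernel source, or an empty string
// when the vendor is assumed not to accept it.
std::string vec_type_hint_attribute(const std::string& vec_type,
                                    const std::string& vendor) {
  if (!vendor_accepts_vec_type_hint(vendor)) {
    return "";
  }
  std::ostringstream ss;
  ss << "__attribute__((vec_type_hint(" << vec_type << ")))\n";
  return ss.str();
}
```

With the vendor string "Vivante Corporation" the helper returns an empty string, so the generated kernel source would be the same as with the commented-out line.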

After that the kernel does compile, but I get a failure when trying to execute:

CL_OUT_OF_RESOURCES

while trying to "Perform forward"

of this model:

https://github.com/xingwangsfu/caffe-yolo/blob/master/prototxt/yolo_tiny_deploy.prototxt
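
For reference when reading the logs, the numeric OpenCL status codes map to names defined in the Khronos `CL/cl.h` header: CL_OUT_OF_RESOURCES is -5, and the `error -45` that appears in the clinfo output in this report is CL_INVALID_PROGRAM_EXECUTABLE. The sketch below covers only a handful of codes that seem relevant here; `cl_error_name` is an illustrative helper, not a function from Caffe or from OpenCL.

```cpp
#include <string>

// Translate a few OpenCL status codes into their header names.
// The numeric values are the ones defined in CL/cl.h.
std::string cl_error_name(int code) {
  switch (code) {
    case 0:   return "CL_SUCCESS";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    case -55: return "CL_INVALID_WORK_ITEM_SIZE";
    default:  return "unknown OpenCL error " + std::to_string(code);
  }
}
```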

Steps to reproduce

I'm using this board:
https://www.amazon.com/Code-Modules-Inc-PixieBoard-Computing/dp/B07DQBPNZT/ref=sr_1_1?ie=UTF8&qid=1529860144&sr=8-1&keywords=code+and+modules+pixiepro

With kernel 4.9.109

I get this out of clinfo

clinfo: /usr/lib/libOpenCL.so.1: no version information available (required by clinfo)
Number of platforms 1
Platform Name Vivante OpenCL Platform
Platform Vendor Vivante Corporation
Platform Version OpenCL 1.2 V6.2.4.p1.150331
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix viv

Platform Name Vivante OpenCL Platform
Number of devices 1
Device Name Vivante OpenCL Device GC2000+.5450.0000
Device Vendor Vivante Corporation
Device Vendor ID 0x564956
Device Version OpenCL 1.2
Driver Version OpenCL 1.2 V6.2.4.p1.150331
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 4
Max clock frequency 500MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 1024
=== CL_PROGRAM_BUILD_LOG ===
(6:0) : error : syntax error at 'kernel'
Preferred work group size multiple <getWGsizes:1200: create kernel : error -45>
Preferred / native vector sizes
char 4 / 4
short 4 / 4
int 4 / 4
long 4 / 4
half 0 / 0 (n/a)
float 4 / 4
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 268435456 (256MiB)
Error Correction support Yes
Max memory allocation 134217728 (128MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 8192 (8KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 8192 images
Max 2D image size 8192x8192 pixels
Max 3D image size 8192x8192x8192 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Global
Local memory size 32768 (32KiB)
Max number of constant args 9
Max constant buffer size 65536 (64KiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
printf() buffer size 1048576 (1024KiB)
Built-in kernels (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_gl_sharing

NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [viv]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Vivante OpenCL Platform
Device Name Vivante OpenCL Device GC2000+.5450.0000
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Vivante OpenCL Platform
Device Name Vivante OpenCL Device GC2000+.5450.0000
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Vivante OpenCL Platform
Device Name Vivante OpenCL Device GC2000+.5450.0000

The GPU has 768MiB of RAM assigned. I can assign more, but this doesn't seem to have any effect, and while monitoring memory usage it never gets close to OOM.

I'm running this command:

caffe time --model models/yolo/yolo_tiny_deploy.prototxt -gpu 0 |& tee error.log
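
One sanity check on the memory side is to compare buffer sizes against the limits the driver itself reports in the clinfo output (128MiB maximum single allocation, 256MiB global memory) rather than against the 768MiB assigned to the GPU. The sketch below does only that arithmetic; `blob_bytes` and `fits_single_allocation` are illustrative names. CL_OUT_OF_RESOURCES can also refer to per-kernel resources such as registers or local memory, so passing this check would not rule the error out.

```cpp
#include <cstdint>
#include <vector>

// Limits copied from the clinfo output in this report.
constexpr std::uint64_t kMaxAllocBytes  = 134217728ULL;  // Max memory allocation
constexpr std::uint64_t kGlobalMemBytes = 268435456ULL;  // Global memory size

// Size in bytes of a dense blob with the given shape and element size.
std::uint64_t blob_bytes(const std::vector<std::uint64_t>& shape,
                         std::uint64_t elem_size) {
  std::uint64_t bytes = elem_size;
  for (std::uint64_t dim : shape) {
    bytes *= dim;
  }
  return bytes;
}

// True if a buffer of this size stays within the reported per-allocation limit.
bool fits_single_allocation(std::uint64_t bytes) {
  return bytes <= kMaxAllocBytes;
}

// True if the sum of all buffer sizes stays within the reported global memory.
bool fits_global_memory(const std::vector<std::uint64_t>& buffer_sizes) {
  std::uint64_t total = 0;
  for (std::uint64_t bytes : buffer_sizes) {
    total += bytes;
  }
  return total <= kGlobalMemBytes;
}
```

As a usage example with a hypothetical shape, `blob_bytes({1, 3, 448, 448}, 4)` gives the size of a single-precision 3-channel 448x448 blob, which can then be passed to `fits_single_allocation`.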

Tried solutions

I tried setting different max_work_item sizes in ocl_device, hardcoding them to 256 in each dimension.
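
In OpenCL, each local work size must not exceed the per-dimension limit, and the product of the local sizes must not exceed the maximum work group size (1024 on this device according to clinfo). A local size of 256 in each of three dimensions would exceed that product. The following is a minimal sketch of one way to clamp a requested local size; `clamp_local_size` is a hypothetical helper, not the logic Caffe uses, and the halving strategy is just one simple choice.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Clamp a requested 3D local work size so that every dimension respects
// max_item_sizes and the product respects max_group_size.
std::array<std::size_t, 3> clamp_local_size(
    std::array<std::size_t, 3> local,
    const std::array<std::size_t, 3>& max_item_sizes,
    std::size_t max_group_size) {
  assert(max_group_size > 0);
  for (std::size_t i = 0; i < 3; ++i) {
    assert(max_item_sizes[i] > 0);
    // A zero-sized dimension is not valid; treat it as 1.
    local[i] = std::max<std::size_t>(local[i], 1);
    local[i] = std::min(local[i], max_item_sizes[i]);
  }
  // Halve the largest dimension until the product fits.
  while (local[0] * local[1] * local[2] > max_group_size) {
    auto largest = std::max_element(local.begin(), local.end());
    *largest = (*largest + 1) / 2;
  }
  return local;
}
```

The sketch does not handle the additional OpenCL 1.2 requirement that the global work size be evenly divisible by the local work size in each dimension, nor any lower per-kernel limit the driver may report.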

System configuration

OS: Linux 4.9.109
System: PixiePro+
RAM: 4GiB

Thanks!
