Issue summary
Hi,
I'm trying to get Caffe to work on an i.MX6QP+, which has a GC2000+ GPU (Full Profile). I modified the source code in ocl_device_program.cpp to remove the "vec_type_hint" attribute, as the driver seems to fail compilation if it is present in the kernel.
After that the kernel does compile, but I get a failure when trying to execute:
CL_OUT_OF_RESOURCES
while trying to "Perform forward"
of this model:
https://github.com/xingwangsfu/caffe-yolo/blob/master/prototxt/yolo_tiny_deploy.prototxt
Steps to reproduce
I'm using this board:
https://www.amazon.com/Code-Modules-Inc-PixieBoard-Computing/dp/B07DQBPNZT/ref=sr_1_1?ie=UTF8&qid=1529860144&sr=8-1&keywords=code+and+modules+pixiepro
With kernel 4.9.109
I get this output from clinfo:
clinfo: /usr/lib/libOpenCL.so.1: no version information available (required by clinfo)
Number of platforms 1
Platform Name Vivante OpenCL Platform
Platform Vendor Vivante Corporation
Platform Version OpenCL 1.2 V6.2.4.p1.150331
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix viv
Platform Name Vivante OpenCL Platform
Number of devices 1
Device Name Vivante OpenCL Device GC2000+.5450.0000
Device Vendor Vivante Corporation
Device Vendor ID 0x564956
Device Version OpenCL 1.2
Driver Version OpenCL 1.2 V6.2.4.p1.150331
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 4
Max clock frequency 500MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 1024
=== CL_PROGRAM_BUILD_LOG ===
(6:0) : error : syntax error at 'kernel'
Preferred work group size multiple <getWGsizes:1200: create kernel : error -45>
Preferred / native vector sizes
char 4 / 4
short 4 / 4
int 4 / 4
long 4 / 4
half 0 / 0 (n/a)
float 4 / 4
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 268435456 (256MiB)
Error Correction support Yes
Max memory allocation 134217728 (128MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 8192 (8KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 8192 images
Max 2D image size 8192x8192 pixels
Max 3D image size 8192x8192x8192 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Global
Local memory size 32768 (32KiB)
Max number of constant args 9
Max constant buffer size 65536 (64KiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
printf() buffer size 1048576 (1024KiB)
Built-in kernels (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_gl_sharing
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [viv]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Vivante OpenCL Platform
Device Name Vivante OpenCL Device GC2000+.5450.0000
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Vivante OpenCL Platform
Device Name Vivante OpenCL Device GC2000+.5450.0000
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Vivante OpenCL Platform
Device Name Vivante OpenCL Device GC2000+.5450.0000
The GPU has 768MiB of assigned RAM. I can assign more, but this doesn't seem to have any effect, and while monitoring memory usage it never gets close to OOM
when running this command:
caffe time --model models/yolo/yolo_tiny_deploy.prototxt -gpu 0 |& tee error.log
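The clinfo output above reports a max memory allocation of 134217728 bytes (128MiB) per buffer, so one thing that may be worth ruling out is whether any single buffer the model needs exceeds that limit. The following is a minimal sketch of such a check, not code from Caffe; `fits_single_allocation` is a hypothetical helper, and the element count of each blob would have to come from the model itself.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Caffe or OpenCL): returns true if a buffer
// of `count` elements, each `elem_size` bytes, fits within the device's
// maximum single allocation (CL_DEVICE_MAX_MEM_ALLOC_SIZE).
bool fits_single_allocation(std::uint64_t count,
                            std::uint64_t elem_size,
                            std::uint64_t max_alloc_bytes) {
  if (count == 0 || elem_size == 0) {
    return true;  // an empty buffer trivially fits
  }
  // Compare via division so that count * elem_size cannot overflow.
  return count <= max_alloc_bytes / elem_size;
}
```

With the 128MiB limit reported above, a float buffer of 33554432 elements would fit exactly, and one more element would not.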
Tried solutions
I tried setting different max work item sizes in ocl_device, including hardcoding them to 256 in each dimension.
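For reference, OpenCL requires a local work size to respect both the per-dimension limits (max work item sizes) and the total work-group limit (max work group size, 1024 in the clinfo output above). If a launch actually used 256 in each of three dimensions, the product would be 16777216, far above 1024. The following is a minimal sketch of clamping a requested local size to both limits; `clamp_local_size` is a hypothetical helper, not Caffe's implementation, and halving the largest dimension is just one possible policy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Caffe): shrink a requested local work size
// until it satisfies the per-dimension limits and the total work-group limit.
std::array<std::size_t, 3> clamp_local_size(
    std::array<std::size_t, 3> requested,
    const std::array<std::size_t, 3>& max_item_sizes,
    std::size_t max_group_size) {
  assert(max_group_size >= 1);
  // Step 1: clamp each dimension to the range [1, max_item_sizes[d]].
  for (std::size_t d = 0; d < 3; ++d) {
    assert(max_item_sizes[d] >= 1);
    if (requested[d] < 1) requested[d] = 1;
    if (requested[d] > max_item_sizes[d]) requested[d] = max_item_sizes[d];
  }
  // Step 2: halve the largest dimension until the product fits.
  auto total = [&requested] {
    return requested[0] * requested[1] * requested[2];
  };
  while (total() > max_group_size) {
    std::size_t largest = 0;
    for (std::size_t d = 1; d < 3; ++d) {
      if (requested[d] > requested[largest]) largest = d;
    }
    requested[largest] = (requested[largest] + 1) / 2;  // round up when halving
  }
  return requested;
}
```

With the limits reported above, a request of 256x256x256 is reduced to 8x8x16, whose product is exactly 1024. A kernel may have a lower limit of its own (CL_KERNEL_WORK_GROUP_SIZE), which would be passed as `max_group_size` instead.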
System configuration
OS: Linux 4.9.109
System: PixiePro+
RAM: 4GiB
Thanks!