
[OpenCL][Mxnet] Error: Entry function 'fuse_conv2d_relu_7__kernel2' uses too much shared data #1302

Closed
nguyenducthinhdl opened this issue Jun 20, 2018 · 4 comments



nguyenducthinhdl commented Jun 20, 2018

Dear contributors,

I hit an issue when targeting OpenCL with an MXNet model in TVM.

This is the error message:

[14:32:18] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.12.0. Attempting to upgrade...
[14:32:18] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
model compiled.
[14:32:23] src/runtime/opencl/opencl_device_api.cc:235: Initialize OpenCL platform 'NVIDIA CUDA '
[14:32:24] src/runtime/opencl/opencl_device_api.cc:260: opencl(0)='GeForce GTX 750 Ti ' cl_device_id=0x83dde60
Traceback (most recent call last):
File "/home/abc/work/code.py", line 81, in
m.run()
File "/usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/contrib/graph_runtime.py", line 113, in run
self._run()
File "/usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 183, in call
ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
File "/usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/_ffi/base.py", line 66, in check_call
raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: [14:32:26] src/runtime/module_util.cc:52: Check failed: ret == 0 (-1 vs. 0) [14:32:26] src/runtime/opencl/opencl_module.cc:141: OpenCL build error for device=0x83dde60ptxas error : Entry function 'fuse_conv2d_relu_9__kernel2' uses too much shared data (0x1ca64 bytes, 0xc000 max)
ptxas error : Entry function 'fuse_conv2d_relu_7__kernel2' uses too much shared data (0xda44 bytes, 0xc000 max)

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/libtvm.so(dmlc::StackTrace[abi:cxx11]()+0x5a) [0x7f61e38f8b7a]
[bt] (1) /usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x5b26ae) [0x7f61e3cea6ae]
[bt] (2) /usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x607127) [0x7f61e3d3f127]
[bt] (3) /usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x6055f7) [0x7f61e3d3d5f7]
[bt] (4) /usr/local/lib/python3.5/dist-packages/tvm-0.4.0-py3.5-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x5e) [0x7f61e3cd7bde]
[bt] (5) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7f622aae9e20]
[bt] (6) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7f622aae988b]
[bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7f622aae401a]
[bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(+0x9fcb) [0x7f622aad7fcb]
[bt] (9) /usr/bin/python3.5(PyObject_Call+0x47) [0x5c1797]
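For reference, the sizes in the ptxas error are hexadecimal byte counts. Decoding them (a quick sketch of my own, with the values copied from the error above) shows how far over the 0xc000 (48 KiB) shared-memory limit each kernel is:

```python
# Decode the shared-memory figures from the ptxas error (hex byte counts).
limit = 0xc000  # 49,152 bytes = 48 KiB of shared memory per block
kernels = {
    "fuse_conv2d_relu_9__kernel2": 0x1ca64,
    "fuse_conv2d_relu_7__kernel2": 0xda44,
}
for name, used in kernels.items():
    print(f"{name}: {used:,} bytes ({used / limit:.2f}x the {limit:,}-byte limit)")
```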

Do you have any idea how to resolve this?

Thanks a lot


tqchen commented Jun 20, 2018

Please use https://discuss.tvm.ai/ for general support questions.

tqchen closed this as completed Jun 20, 2018

FrozenGene commented Jun 21, 2018

I also encountered this problem when running a complex model, so I want to answer here, since I didn't find this question on discuss.tvm.ai. I tracked down what is wrong in the generated OpenCL kernel code. My offending OpenCL kernel code is:

__local float pad_temp_global_global_shared[33800];

I think your model produces similar OpenCL code. The problem is the 33800: that many floats overflows local memory. NVIDIA only allows 48 KB of shared data per work-group, and Intel's GPUs allow 64 KB. You can run the clinfo command to check your device's limit.
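The arithmetic behind that check, as a quick sketch (assuming 4-byte floats, which is what the `__local float` declaration implies):

```python
# Size of the offending __local array: 33800 floats at 4 bytes each.
num_floats = 33800
bytes_used = num_floats * 4          # 135,200 bytes

nvidia_limit = 48 * 1024             # 48 KB local memory per work-group (NVIDIA)
intel_limit = 64 * 1024              # 64 KB on Intel GPUs

print(f"array needs {bytes_used:,} bytes")
print(f"over NVIDIA limit: {bytes_used > nvidia_limit}")
print(f"over Intel limit:  {bytes_used > intel_limit}")
```

So this array blows past both vendors' limits, which is why the kernel fails to build.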

You can refer to #525. When I tuned as suggested there, I was able to reduce the array size below 32K. I think it can help you too.

nguyenducthinhdl (Author) commented

Many thanks, @FrozenGene.

I will try your solution.

Regards

ndcuong91 commented

@FrozenGene thanks for the information. But I think the fused function here uses too much memory (~111 KB), so we need to reduce the array size as you suggest. Can you share in detail how you tune your model?
