This repository has been archived by the owner on Jan 7, 2025. It is now read-only.

inference not working on production server #297

Closed
pclove1 opened this issue Sep 12, 2015 · 8 comments

@pclove1
Contributor

pclove1 commented Sep 12, 2015

Hi,

I have an issue with the 'inference' feature in DIGITS.

When I run a development server via digits-devserver, inference works fine, as shown below:
image

However, when I run a production server via digits-server, I get the error message shown below:
image

This is the error message in the console log:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0911 23:19:51.940702  7290 math_functions.cu:81] Check failed: error == cudaSuccess (4 vs. 0)  unspecified launch failure
*** Check failure stack trace: ***

DIGITS version:

v2.1.0
I installed DIGITS from source.

Nvidia's caffe version:

v0.13.2

cudnn version:

v3

cnmem version:

v1.0.0

Results of pip freeze

Cython==0.23
Flask==0.10.1
-e git://github.com/lukeyeager/flask-autodoc.git@375dd08c702fbf352baca5c974be5311ef30aa67#egg=Flask_Autodoc-dev
Flask-SocketIO==0.6.0
Flask-WTF==0.12
Jinja2==2.8
MarkupSafe==0.23
Pillow==2.9.0
PyYAML==3.11
WTForms==2.0.2
Werkzeug==0.10.4
argparse==1.2.1
beautifulsoup4==4.4.0
coverage==4.0b3
coveralls==1.0b1
decorator==4.0.2
distribute==0.7.3
docopt==0.6.2
eventlet==0.17.4
funcsigs==0.4
gevent==1.0.2
gevent-socketio==0.3.6
gevent-websocket==0.9.5
greenlet==0.4.9
gunicorn==19.3.0
h5py==2.5.0
ipython==4.0.0
ipython-genutils==0.1.0
itsdangerous==0.24
leveldb==0.193
lmdb==0.87
matplotlib==1.4.3
mock==1.3.0
networkx==1.10
nose==1.3.7
numpy==1.9.2
pandas==0.16.2
path.py==7.6
pbr==1.7.0
pexpect==3.3
pickleshare==0.5
protobuf==3.0.0a1
pydot2==1.0.33
pyparsing==2.0.3
python-dateutil==1.5
python-engineio==0.6.6
python-gflags==2.0
python-socketio==0.4.0
pytz==2015.4
requests==2.7.0
scikit-image==0.11.3
scikit-learn==0.16.1
scipy==0.16.0
selenium==2.47.1
simplegeneric==0.8.1
six==1.9.0
traitlets==4.0.0
wsgiref==0.1.2
yolk==0.4.3

GPUs configuration:

GTX TITAN X + Quadro K620

Please let me know if there is a way to make the log more verbose so that I can provide more information.

lukeyeager added the bug label Sep 14, 2015
@lukeyeager
Member

@pclove1, thanks for the detailed bug report! I've seen something similar myself - I'll get back to you on this ...

@lukeyeager
Member

I filed this at NVIDIA/caffe#38 and forgot to update here.

@lukeyeager
Member

@pclove1 can you try the NVcaffe v0.14 RC? Use git checkout v0.14.0-rc.1 or git checkout caffe-0.14. This issue seems to be fixed for me with that branch - NVIDIA/caffe#42 (comment).

@pclove1
Contributor Author

pclove1 commented Nov 9, 2015

Thanks for your follow-up, @lukeyeager.

Yeah, I tried NVcaffe v0.14 RC and it seems this issue is gone.
While trying it out, I filed a CMake build issue here.

Plus, I found that ./digits-devserver version parsing seems unhappy with a pattern like 0.14.0-rc.1 with the following message.

ERROR: Could not get version information from caffe at "</path/to>/caffe/build/tools/caffe". Are you using the NVIDIA fork?

I worked around this by overriding the Caffe version to 0.14.0.
The DIGITS version I used was v2.1.0.

Thanks.

@lukeyeager
Member

> Plus, I found that ./digits-devserver version parsing seems unhappy with a pattern like 0.14.0-rc.1 with the following message.

Yep, good debugging. I had to address this with #336.
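
For anyone hitting the same error on an older checkout, here is a minimal sketch of version parsing that tolerates a pre-release suffix such as -rc.1. It is an illustration only, not the change made in #336 and not code from the DIGITS source: the pattern, the function name parse_caffe_version, and the sample input string are all assumptions.

```python
import re

# Hypothetical pattern (not taken from DIGITS): MAJOR.MINOR.PATCH followed by
# an optional pre-release suffix such as "-rc.1".
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.]+))?")


def parse_caffe_version(text):
    """Return ((major, minor, patch), prerelease_or_None), or None if no version is found."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    major, minor, patch, prerelease = match.groups()
    return (int(major), int(minor), int(patch)), prerelease


if __name__ == "__main__":
    # The exact text printed by the caffe binary is assumed here.
    print(parse_caffe_version("caffe version 0.14.0-rc.1"))
    print(parse_caffe_version("caffe version 0.13.2"))
```

Keeping the suffix separate from the numeric tuple means a release candidate still compares as 0.14.0 when checking a minimum supported version, which is effectively what the manual override to 0.14.0 achieved.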

@lukeyeager
Member

@pclove1 can you try using the v3.0 RC3 deb packages to see if the issue persists?

https://github.com/NVIDIA/DIGITS/blob/digits-3.0/docs/UbuntuInstall.md

@pclove1
Contributor Author

pclove1 commented Dec 18, 2015

I tried the following environment, and I can confirm that the issue doesn't happen any longer.

  • cuDNN: cudnn-7.0-linux-x64-v4.0-rc
  • Caffe: v0.14.0-rc.3
  • DIGITS: v3.0.0-rc.3

@lukeyeager
Member

Great, thanks!
