
OpenCL backend performance question on Windows for Open NSFW Resnet 50 with GTX 680 #71

Open
TechnikEmpire opened this issue Jan 11, 2018 · 3 comments

TechnikEmpire commented Jan 11, 2018

Issue summary

I hope I'm not breaking the rules, but I'm wondering whether the performance I'm seeing from the OpenCL backend on my system seems right. I did not use the built-in build system (CMake/Ninja); instead I manually constructed a Visual Studio 2017 project, compiled all required dependencies, created library exports, etc. I built from HEAD of the OpenCL branch of Caffe.

I can see that processing is being offloaded to the GPU, but the GPU load is very small (about half a percent). I'm seeing around 75 ms of execution time on the forward pass with this model. I have not yet tested batching; this is single-instance classification.
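For reference, this is the kind of timing loop I mean; a minimal pycaffe sketch (my actual harness is the C++ build described above, and the file names and the 'data' input blob name here are placeholders):

```python
# Minimal pycaffe sketch of a forward-pass timing loop (placeholder paths).
import time
import numpy as np
import caffe

caffe.set_mode_gpu()
caffe.set_device(0)  # device index of the OpenCL GPU in this build

# Placeholder file names; substitute the Open NSFW ResNet-50 deploy/weights files.
net = caffe.Net('deploy.prototxt', 'resnet_50_1by2_nsfw.caffemodel', caffe.TEST)

# Feed random data shaped like the network's input blob (assumed to be named 'data').
net.blobs['data'].data[...] = np.random.rand(*net.blobs['data'].data.shape)

net.forward()  # warm-up pass so one-time kernel compilation isn't measured

iters = 50
start = time.time()
for _ in range(iters):
    net.forward()
print('average forward pass: %.1f ms' % ((time.time() - start) * 1000.0 / iters))
```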

I'm just curious whether this performance is expected or whether it's symptomatic of me doing something wrong. This execution speed is about double that of running on the CPU with OpenCV's DNN module (I know that's apples to oranges, but still).

Either way, thanks for your work. When I have time, I'll try out batching to see whether it gives a boost or not.

P.S. As a side note, I had to change the bias_filler type from xavier to constant to avoid random crashes.
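For clarity, the edit was along these lines in the model prototxt (an illustrative layer, not copied verbatim from the Open NSFW deploy file):

```
layer {
  name: "conv_1"
  type: "Convolution"
  bottom: "data"
  top: "conv_1"
  convolution_param {
    num_output: 64
    kernel_size: 7
    stride: 2
    # previously: bias_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
```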

Steps to reproduce

Compile from source and benchmark the forward-pass execution time with the OpenCL backend using the linked ResNet-50 model.
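If the standard command-line tools are part of the build, Caffe's built-in layer-by-layer benchmark should give comparable numbers; something along these lines (device index and iteration count are arbitrary, and the binary may be caffe.exe on Windows):

```
caffe time --model=deploy.prototxt --gpu=0 --iterations=50
```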

Your system configuration

Operating system: Windows 10 x64
Compiler: MSVC 15 (2017)
CUDA version (if applicable):
CUDNN version (if applicable):
BLAS: OpenBLAS/clBLAS
Python or MATLAB version (for pycaffe and matcaffe respectively):

naibaf7 (Owner) commented Jan 12, 2018

@TechnikEmpire I personally haven't tested ResNet on a GTX 680 yet. If you provide me with the code you use to benchmark, I can run some benchmarks on other GPUs if you'd like a ballpark reference. Thanks for your interest.

naibaf7 self-assigned this Jan 12, 2018
naibaf7 (Owner) commented Jan 12, 2018

@TechnikEmpire It also has to be noted that performance can vary greatly depending on which BLAS you use for your GPU. ViennaCL is the default, but you should also try CLBlast, clBLAS, or ISAAC.
Enabling the LibDNN option can also increase performance (it's a cuDNN replacement).
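For a CMake build of the OpenCL branch, these choices map to configuration options roughly like the following (option names as I recall them; verify them against the branch's CMakeLists.txt, and a hand-built Visual Studio project would need the equivalent preprocessor defines and link libraries instead):

```
cmake .. -DUSE_GREENTEA=ON -DUSE_LIBDNN=ON -DUSE_CLBLAST=ON -DUSE_CLBLAS=OFF -DUSE_ISAAC=OFF
```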

TechnikEmpire (Author) commented

@naibaf7 Thanks. I'll have to check over how I compiled everything. AFAIK I compiled with the clBLAS and LibDNN backends, but I could have goofed something.
