Proper Caffe OpenCL branch installation instructions for Intel GPU #5099
@atlury It can be put in any one of the paths where CMake will find it, such as next to the folder into which you cloned Caffe. The build instructions are different from the Linux instructions, since there is a script that automatically takes care of CMake configuration and downloading dependencies. Usually there's no huge need to worry about configuration on Windows, since it's designed to just work, but I will give you a quick explanation. I will update the Readme after Christmas when I have holidays; I unfortunately don't have time for a detailed step-by-step right now. |
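For orientation, the overall Windows flow might look roughly like this (a minimal sketch; the branch name, clone URL, and script path are assumptions based on the links and script names mentioned later in this thread):

```bat
:: Minimal sketch of the Windows build flow described above.
:: Assumes the opencl branch of BVLC/caffe and the scripts\build_win.cmd helper
:: referenced later in this thread; adjust paths for your setup.
git clone -b opencl https://github.com/BVLC/caffe.git caffe-opencl
cd caffe-opencl
scripts\build_win.cmd
```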
@naibaf7 Are OpenCL BLAS and ISAAC still needed? |
@atlury This branch is not the same as https://github.com/01org/caffe/wiki/clCaffe Strict requirements:
Optional requirements:
|
Thanks @naibaf7 |
@atlury |
Fabian, will the Windows build support compiling with MinGW-64? Kindly let me know if there are any instructions specific to it. Microsoft Visual Studio is too bloated. |
@atlury Currently no, not that I am aware of. @willyd is the main contributor and maintainer of windows building, so maybe he can answer that. |
I have no intention of supporting MinGW-64, as CUDA does not support MinGW as a host compiler on Windows. That being said, I welcome any PRs related to supporting MinGW-64 if they don't add too much complexity to the build. |
@willyd |
Does the Windows OpenCL build include support for engine: SPATIAL? When I include engine: SPATIAL or engine: INTEL_SPATIAL, I get the following error: Layer conv1 has unknown engine. The wiki/readme at https://github.com/BVLC/caffe/tree/opencl is a confusing read. It mentions both "add entry engine: SPATIAL to all convolution layer specification" as well as "engine: INTEL_SPATIAL <-------------------------- this line!" Which one? And it runs fine without the engine setting in the prototxt: opencl-caffe-test.exe imagenet_deploy.prototxt bvlc_reference_caffenet.caffemodel imagenet_mean.binaryproto synset_words.txt truck.jpg Also, here are a few "other" observations |
Further C:\Downloads\xxx.caffe-opencl-build\bin>caffe device_query
I0108 12:35:04.889681 19872 common.cpp:408] Device id: 0
I0108 12:35:04.897233 19872 common.cpp:408] Device id: 1
I0108 12:35:04.905594 19872 common.cpp:408] Device id: 2
|
Looks good to me, although it seems you have both a newer OpenCL 2.1 and an older OpenCL 1.2 installed. As it's still a Haswell CPU, I am not sure if Intel already has a 2.1/2.0 driver for your chip, but you should try to update the OpenCL SDK for your GPU. Anyway, if you want to use INTEL_SPATIAL you also need to enable it at compile time; after that it becomes the standard engine on Intel GPU devices. However, the Intel spatial kernels have not been thoroughly tested on Windows yet. |
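For reference, enabling the spatial engine at compile time on Windows might look like the following (a minimal sketch, assuming build_win.cmd honors the USE_INTEL_SPATIAL environment variable, as later comments in this thread suggest):

```bat
:: Minimal sketch: request the Intel spatial convolution engine at compile time.
:: Assumes build_win.cmd picks up the USE_INTEL_SPATIAL environment variable.
set USE_INTEL_SPATIAL=1
scripts\build_win.cmd
```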
I will try to update the OpenCL SDK. I just saw your commits; I will try to enable them, recompile, test, and report back. |
Okay, with "if NOT DEFINED USE_INTEL_SPATIAL set USE_INTEL_SPATIAL=1", build_win.cmd throws the following error:
C:\Downloads\caffe-opencl\build\ALL_BUILD.vcxproj" (default target) (1) -> (ClCompile target) ->
C:\Downloads\caffe-opencl\src\caffe\layers\conv_layer_spatial.cpp(1453): error C2572: 'caffe::ConvolutionLayerSpatial::swizzleWeights': redefinition of default argument: parameter 1 [C:\Downloads\caffe-opencl\build\src\caffe\caffe.vcxproj]
C:\Downloads\caffe-opencl\src\caffe\layers\conv_layer_spatial.cpp(1458): error C2572: 'caffe::ConvolutionLayerSpatial::swizzleWeights': redefinition of default argument: parameter 1 [C:\Downloads\caffe-opencl\build\src\caffe\caffe.vcxproj] |
Ok, I'll look into that. |
Hi all, |
@gfursin It should, by a large margin. LibDNN expects the GPU to have a different memory architecture than what Intel chips have, so it does not run optimally at the moment. |
Super! Thanks a lot! |
By the way, @atlury, when selecting devices 1 and 2, "caffe time" crashed each time after around 10 seconds - did you see the same behavior? Thanks! |
@gfursin No, I did not run caffe time (I will try to and report back). I was frustrated with Windows and later shifted to Ubuntu 17.04; see my comment on Linux in #5165. It works with spatial and I get more than 30 fps (VGG) on Linux. There is an Intel paper published here (clCaffe) where the following benchmarks (page 28, GT3 GPU) were obtained using INTEL_SPATIAL in the convolution layers. I really want to test out object detection (not just classification) using INTEL_SPATIAL as well, but there is no such example anywhere; I doubt the Caffe layers are ready for it yet. @naibaf7? @gongzg, is there any source code for the above tests that we can try? Further, LibDNN has been made to work with tiny-dnn, which is exciting (although there are not many pre-trained models there). I also want to test out quantization and see how OpenCL can help there (8-bit, XNOR, etc.). Finally, object detection in OpenCL in real time would be awesome! I hope @naibaf7 can throw some light on this. |
@atlury I'll get back to you next week regarding the more difficult questions. |
Wow, I just read your paper... the concept of strided kernels seems very impressive. Not to hijack this thread, but all of this will eventually need to be tested in OpenCL under Windows. Before that: is this a Python-only implementation? No C++? Are there any pre-trained models? Is this where the repo is: https://github.com/naibaf7/PyGreentea/tree/master/examples ? Yes, I am going to use LibDNN... |
@atlury Yes, the original interface was C++, but we switched to Python. However, if you want to provide the data through HDF5 or your own C++ interface, that will work too. Just use the network generator code that I provide in Python to help you create the correct prototxt for SK/U-type networks. |
@atlury Some of the benchmark data are measured using convnet-benchmarks, and you can reproduce them on your platform. We don't have other examples to share publicly at the moment. |
@atlury - thanks a lot for the references! I had many troubles installing and using OpenCL for Intel GPU on Ubuntu in the past (I had to recompile the Linux kernel), but maybe the latest drivers will work ok - I need to check that. By the way, in #5165 you have a snapshot of a webcam + Caffe classification with FPS measurements - may I ask which program you used for that? Thanks a lot!!! |
Please do the following.
Include engine: INTEL_SPATIAL for all convolutional layers in your deploy.prototxt, and get the synset_words.txt.
Just make sure the input_dim is 1 in your prototxt and not 10 (you are only giving it one image at a time), with 3 channels; the resizing is automatic. A sketch of such a prototxt fragment follows below. For any additional help, buzz me on skype:atlury or gtalk:atlury. Please note that this will only work on Linux; OpenCL support for Windows is still being worked on by @naibaf7 |
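A minimal sketch of what such a deploy prototxt fragment could look like (the layer name, output count, kernel size, and input dimensions are illustrative, not taken from this thread):

```
# Sketch of a deploy prototxt fragment: one 3-channel image per batch,
# with the Intel spatial engine requested for the convolution layer.
input: "data"
input_dim: 1     # one image at a time, not 10
input_dim: 3     # three channels
input_dim: 227   # height (illustrative)
input_dim: 227   # width (illustrative)
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96    # illustrative
    kernel_size: 11   # illustrative
    stride: 4         # illustrative
    engine: INTEL_SPATIAL
  }
}
```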
Thank you very much @atlury for all details - very much appreciated - I will test it soon! By the way, I started automating installation of Caffe on Windows (CPU and OpenCL mode) using Collective Knowledge Framework, but it still needs more testing: https://github.com/dividiti/ck-caffe |
@bxk-sonavex It will work, but not with the Intel convolutions, and therefore with non-optimal performance. |
@naibaf7 Yes, I am using Intel SDK v6.3. I found a workaround here (#5575) and it works for me. Now I have the opencl branch compiled. Further, I tested my build using the MNIST example provided in the examples folder. When using the CPU (by modifying lenet_solver.prototxt), train_lenet ran without any problem and the final training accuracy was 0.9902, as expected.
However, when using the GPU, I got a "caffe.exe has stopped working" error message window and the accuracy was just 0.1009.
Could you give me some leads on what happened? How to solve it? Or is this the thing that @gongzg mentioned?
The places I modified from the default
Should I set the |
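For context, the CPU/GPU switch in the MNIST example lives in examples/mnist/lenet_solver.prototxt; the relevant line is roughly:

```
# In examples/mnist/lenet_solver.prototxt:
solver_mode: GPU   # set to CPU to reproduce the working CPU run
```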
When USE_INTEL_SPATIAL=1 is set, the branch cannot be compiled. The error is: ninja: build stopped: subcommand failed. |
@naibaf7 The 01org version works fine on Windows now. I'm still busy with other things, so I haven't had enough time to submit all the fixes to this OpenCL branch; I will do that when I have some time in the near future. @bxk-sonavex You can try the 01org version following the wiki page, and if you run into any problem with it, please let me know. |
@gongzg Thanks! Following the instructions on https://github.com/01org/caffe/wiki/clCaffe#windows-support-for-intel-gen-platform, I got this error message:
FYI: UPDATE:
Supposedly, the files should be generated automatically. Then I got the following error:
Disabled |
@bxk-sonavex It seems that it was already built successfully. You need to copy the dll files to the executable files' directory: |
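(The exact file list is not reproduced above; a hypothetical example of the copy step, with placeholder paths, would be:)

```bat
:: Hypothetical paths only; substitute the actual dll names and directories from your clCaffe build.
copy C:\path\to\clCaffe\build\bin\*.dll C:\path\to\clCaffe\build\test\
```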
@gongzg I added the folders of the two DLLs to the system path instead of copying them to the test folder. Now I get another error, which looks pretty serious...
I am using Intel Iris Plus Graphics 650 and intel_sdk_for_opencl_setup_6.3.0.1904. Any thoughts or solutions? |
@bxk-sonavex You need to update your Intel Graphics driver to the latest version. |
@gongzg Thanks, that solved the compiling error. When running the tests, I got a whole bunch of errors like the following (I may not have caught all of them):
Should I be concerned about these errors? Anyway, I am testing the build using the MNIST example. It's extremely slow, even much slower than the original Caffe using the CPU. And there are some warnings (repeated several times):
Any idea? |
Why don't you work with Caffe on Linux for the time being? The devs, I guess, are focused more on getting the FP16, INT8 code, etc. running smoothly, especially @naibaf7 (David). Proper Windows support will come eventually. Just a suggestion though. |
@atlury I'd love to!!! But our system is Windows 10 + Intel Iris ... Any idea when Windows support will come? Or is there any other DL framework that works (using the GPU)? |
@gongzg Just want to update you on the performance. I am wondering what the performance is on Linux; then I could get a basic idea of how much speedup there is using the Intel GPU (OpenCL) vs the CPU. Thanks!! |
Ben, did you enable the OpenCL kernels? Did you try using INTEL_SPATIAL? |
@atlury What do you mean by "enable the opencl kernels"? Yes, I followed the instructions here (https://github.com/01org/caffe/wiki/clCaffe#how-to-build) and did "set USE_INTEL_SPATIAL=1" on the command line (not by directly modifying the build_win.cmd file). UPDATE: |
Ben, you will need to include engine: INTEL_SPATIAL for all convolutional layers in your deploy.prototxt. I have personally tested it in real time on Linux: "I have tested on an Intel TV stick with a webcam, using the Intel spatial kernels and a 19-layer VGG model. I am able to get real-time classification, all under 3.5 watts." Windows should also work. |
@bxk-sonavex For the issue with the 01org version, please open an issue there. There are some test failures due to an FP16 precision issue in those gradient test cases, which is not critical. The extremely slow performance should be caused by the auto-tuning; it should be much faster when you run it again. You can first try to use build/tools/caffe to measure the forward performance for AlexNet (see the sketch below). |
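For example, something along these lines (a sketch; the AlexNet deploy prototxt is the stock one shipped with Caffe, and the device index and iteration count are arbitrary):

```sh
# Measure layer-by-layer forward/backward timings for AlexNet on OpenCL device 0.
./build/tools/caffe time --model=models/bvlc_alexnet/deploy.prototxt --gpu=0 --iterations=10
```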
By the way, I just noticed that @CNugteren released the new 1.2.0 version of his auto-tuned CLBlast library a few days ago. I checked it and it seems to be working with Caffe on my Windows 10 Lenovo laptop with an old Intel 4400 GPU (as well as on Linux) - so it can be a nice addition to Caffe, since the previous CLBlast version was seg-faulting on Windows! If you are interested, you can check the speed of Caffe with LibDNN and CLBlast, for example on SqueezeDet, as follows (the same procedure on both Windows and Linux):
It will take some time, since CK will attempt to detect your environment and compilers. After that you can just install SqueezeDet and run the internal timing:
The first run can be a bit slow due to kernel compilation and caching, so the second run will be much faster! You can also benchmark image classification:
Not related to Intel, but just a note that there seems to be a minor bug when compiling Caffe with CLBlast 1.2.0 for Android ARM64 using Android GCC 4.9.x ("to_string" not found in the std namespace):
It would be nice to fix it, since CLBlast 1.1.0 works fine on Android... In that case, it will be working with Caffe across all platforms. Hope it's of any help, and have a good weekend! |
Not sure whether you mean that this is a bug in CLBlast or in Caffe? In any case, CLBlast has this implemented in a special Android header. Perhaps that could be used within Caffe as well? |
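For readers hitting the same toolchain limitation, the usual workaround looks roughly like this (a generic sketch of the common pattern, not the actual CLBlast Android header):

```cpp
// Generic fallback for toolchains whose <string> lacks std::to_string
// (e.g. older Android GCC): build the string through a stream instead.
#include <sstream>
#include <string>

template <typename T>
std::string to_string_fallback(const T& value) {
  std::ostringstream stream;
  stream << value;
  return stream.str();
}
```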
@CNugteren - I just checked and the problem is not in CLBlast. I just forgot a patch in CK which fixes LibDNN for Android (so my fault). I have added it (https://github.com/dividiti/ck-caffe/blob/master/package/lib-caffe-bvlc-opencl-clblast-universal/patch.android/android.fgg.patch3) and it's now possible to compile Caffe with CLBlast and LibDNN. I checked the classification and benchmarking examples on my Samsung S7 - they work fine. So sorry for the false alarm, and thanks for releasing the new CLBlast - I can now use it in Caffe on Linux, Windows and Android. |
@gfursin Is this a version using the CPU or the GPU (OpenCL)? I thought it was said that OpenCL is not working on Windows yet (or at least not with Intel iGPUs yet). What are you using on Windows? |
Ben, sorry for the delay in responding; I was away. To quote @naibaf7: add the entry "engine: INTEL_SPATIAL" to every convolution layer specification. Take AlexNet as an example: edit the file, say $CAFFE_ROOT/models/bvlc_alexnet/train_val.prototxt, and add the following line (shown in the sketch below) to make the conv1 layer be computed using spatial convolution; likewise change the other layers.
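The added line goes inside the layer's convolution_param block, roughly like this (a sketch; only the engine line is new, the rest of conv1's parameters stay as in the stock AlexNet prototxt):

```
layer {
  name: "conv1"
  type: "Convolution"
  # ... existing bottom/top/param entries unchanged ...
  convolution_param {
    # ... existing num_output/kernel_size/stride entries unchanged ...
    engine: INTEL_SPATIAL   # the added line
  }
}
```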
Edit: My bad, I see you opened another thread and seem to have progressed a bit further. |
@bxk-sonavex - I use the Caffe OpenCL version (with LibDNN and CLBlast) on Windows with an old Intel 4400 GPU WITHOUT Intel spatial - it seems to be working fine, but it may be suboptimal. Here is the list of Caffe devices ("ck run program:caffe --cmd_key=query_gpu_opencl"): Here is the output from image classification on Windows with the above Caffe OpenCL version and GoogleNet: I mostly check inference/object detection at this stage (we are trying to unify DNN installation, benchmarking and optimization across all possible platforms), so I didn't really stress other Caffe capabilities and models on Windows with OpenCL... I also just tried to compile Caffe OpenCL with Intel spatial ON ("ck install package:lib-caffe-bvlc-opencl-libdnn-clblast-universal --env.USE_INTEL_SPATIAL=ON"), and I observe the same 2 build errors as reported earlier by @atlury: |
Is there a build script available for Linux (Ubuntu 16.04) too? I am getting errors when trying to compile. |
@rachithayp Follow the instructions carefully; it will work even on the 18.0x series. We have tested it. |
Hi @rachithayp. Just a note that you likely need to patch the kernel to make Intel OpenCL work on Ubuntu 16.04: https://github.com/dividiti/ck-caffe/wiki/Installation#Intel_CPUGPU_Linux . I managed to build the OpenCL branch of Caffe on my Ubuntu 18.04 (Lenovo T470p laptop with Intel GPU) without patching the kernel and with the latest Intel OpenCL via CK a few weeks ago:
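(The exact command was not captured above; a plausible reconstruction, assuming the same CK package name quoted earlier in this thread, is:)

```sh
# Hedged reconstruction: install the Caffe OpenCL (LibDNN + CLBlast) package via CK,
# which resolves dependencies and invokes cmake as described below.
ck install package:lib-caffe-bvlc-opencl-libdnn-clblast-universal
```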
CK will attempt to detect your available compilers, OpenCL libraries and other dependencies, and will invoke cmake for Caffe. If the build is successful, you can check installation using CK virtual env:
You can also try a sample image classification as follows:
Good luck. |
@atlury I was able to compile using the cmake command below: But trying to compile with INTEL_SPATIAL ON gives the errors below: /home/intel/Documents/caffe_src/opencl_caffe/src/caffe/libdnn/libdnn_conv_spatial.cpp:19:1: error: ‘LibDNNConvSpatial’ does not name a type Any idea what could be wrong? Also, there is no include/caffe/greentea folder on the opencl branch, so I copied it from "https://github.com/01org/caffe". |
@rachithayp I hope it throws some light and helps you in your OpenCL Caffe endeavors. |
Does your HD 4400 run Caffe faster than the CPU? |
I am sorry that I have to open this, but neither the opencl GitHub branch nor the Google forums have any kind of (updated) step-by-step instructions for installing Caffe OpenCL on an Intel GPU with the Intel OpenCL drivers, especially for someone new.
(a) Do these instructions still work?
cmake -DUSE_GREENTEA=ON -DUSE_INTEL_SPATIAL=ON -DUSE_ISAAC=ON path_to_caffe_source
make -jn
make -jn runtest
on this branch: https://github.com/BVLC/caffe/tree/opencl ?
Or what about:
cmake -DUSE_GREENTEA=ON -DUSE_INTEL_SPATIAL=ON -DUSE_ISAAC=ON -DBUILD_SHARED_LIBS=OFF -DUSE_CUDNN=OFF -DUSE -DBUILD_docs=OFF -DBUILD_python=OFF -DBUILD_matlab=OFF /root/caffe-opencl
(b) Is the ATLAS package still needed for compiling opencl-caffe when clBLAS is there? It keeps asking for ATLAS.
(c) What about ViennaCL? Does that branch still depend on it? Is it needed?
(d) What is LibDNN for? What is it used in place of?
(e) What about ISAAC?
(f) The windows branch, for example, says "If CUDA is not installed Caffe will default to a CPU_ONLY build". Does this mean it will not work in OpenCL mode in non-CUDA builds?
Kindly update and provide step-by-step instructions
Thank you