Compatibility issue with CUDA 8.0 #18
Comments
Duplicate of #3. CUDA 8.0 support is still being worked on. |
Please track #3 for recent progress. |
Hi, @reyoung. The problem I posted here occurs when using PaddlePaddle AFTER SUCCESSFULLY BUILDING THE CODE, which is different from #3, the issue you referenced. As far as I know, #3 was reported by someone who hit a problem in the build process with CUDA 8.0, and that problem is easily resolved by my merged PR #15. So I don't think the two issues should be treated as the same problem. Appended is my nvcc version for your reference:
Hope it helps :) |
Hi, "invalid device function" indicates a CUDA / GPU incompatibility. Open cmake/flags.cmake and add the following code: list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_60") endif() Then rebuild the project. |
Thanks for your reply @gangliao ! |
@gangliao same errors when running I appended the lines at the end of if (CUDA_VERSION VERSION_GREATER "7.0")
list(APPEND __arch_flags " -gencode arch=compute_52,code=sm_52")
endif()
if(CUDA_VERSION VERSION_GREATER "8.0")
list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_52")
endif()
set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS}) and ran the same build steps, and the build did succeed. P.S.: I get the same error when I directly replace the block with: # if (CUDA_VERSION VERSION_GREATER "7.0")
# list(APPEND __arch_flags " -gencode arch=compute_52,code=sm_52")
# endif()
if(CUDA_VERSION VERSION_GREATER "8.0")
list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_52")
endif()
set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS}) |
@stoneyang Why did you set sm_52? How about: if (CUDA_VERSION VERSION_GREATER "7.0")
list(APPEND __arch_flags " -gencode arch=compute_52,code=sm_52")
endif()
if(CUDA_VERSION VERSION_GREATER "8.0")
list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_60")
endif()
set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS}) |
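For readers following along: in a `-gencode` pair, `arch=` names the virtual architecture the code is compiled against and `code=` names what gets embedded in the binary, so the two are normally kept matched (`compute_60` with `sm_60`). A minimal sketch, assuming `__arch_flags` is consumed by `set(CUDA_NVCC_FLAGS ...)` as in the snippets above. The second line, which also embeds PTX, is an optional addition that nobody in this thread proposed:

```cmake
# Sketch only, not the project's actual cmake/flags.cmake.
# Embed machine code for compute capability 6.0 devices.
list(APPEND __arch_flags " -gencode arch=compute_60,code=sm_60")
# Optional (assumption, not from the thread): also embed PTX so that
# GPUs newer than 6.0 can JIT-compile the kernels at load time.
list(APPEND __arch_flags " -gencode arch=compute_60,code=compute_60")
```

Whether the PTX entry is worth the larger binary depends on which GPUs the build is meant to run on.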
@gangliao Sorry for the typo. I changed those lines (see PR #40), and the new CUDA code generation configuration, which differs from your suggestion, works fine. :) foreach(capability 30 35 50 52 60)
list(APPEND __arch_flags " -gencode arch=compute_${capability},code=sm_${capability}")
endforeach() Please note that the former But a new error occurs when running: $ sh train.sh The log:
Looks like a cuDNN issue about the function |
K20/40 GPUs work fine on this demo. Currently, this bug only appears on GTX GPUs; one of my colleagues has already reproduced it and will fix it soon. Thanks. |
Thanks for reopening this issue! @gangliao Appended is the version of cuDNN: Hope it is helpful. |
Fixed #107, and closed. |
Same issue as the original report still persists after new commits were fetched and re- |
@stoneyang Can you try adding the GLOG_v=3 option on the command line (like this: GLOG_v=3 paddle ...)? This will print out more debug information. |
@hedaoyuan , Hi, thanks for your reply! Do you mean add
|
@stoneyang Fixed by "Fix CUDA_VERSION Comparsion" #165 |
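The title of #165 points at the version check. What that PR actually changed is not shown in this thread, so the following is only a sketch of one way such a check can misfire, assuming `CUDA_VERSION` holds `8.0` on a CUDA 8.0 install: `VERSION_GREATER "8.0"` is false for exactly 8.0, so an `sm_60` flag guarded by it would never be added.

```cmake
# Sketch only; save as a script and run with `cmake -P <file>`.
# Assumption: CUDA_VERSION holds "8.0" on a CUDA 8.0 install.
set(CUDA_VERSION "8.0")

if(CUDA_VERSION VERSION_GREATER "8.0")
  message(STATUS "VERSION_GREATER: branch taken")
else()
  # 8.0 is not strictly greater than 8.0, so this branch runs.
  message(STATUS "VERSION_GREATER: branch skipped")
endif()

# An inclusive check that also works on older CMake releases.
if(NOT CUDA_VERSION VERSION_LESS "8.0")
  message(STATUS "NOT VERSION_LESS: branch taken")
endif()
```

`NOT ... VERSION_LESS` is used here instead of `VERSION_GREATER_EQUAL` because the latter is only available from CMake 3.7 onward.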
@gangliao , seems perfect now! |
Hi, there,
I can successfully build Paddle on my machine, which runs Linux 14.04 LTS with CUDA 8.0, by following the official guide. And for sure, the CPU version runs well, except for the speed....
When I ran the image classification demo with the script train.sh in GPU mode (see train.sh for more details), it unfortunately failed and printed the following info:
It seems that Paddle still does not support the latest version of CUDA....
Appended is my train.sh as a clue: