Large performance gap between the pure C version and TensorFlow #57
Comments
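The pure C version is only meant as a reference implementation. So far only ARM NEON optimizations have been done; other architectures have not been optimized yet.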
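But the SIMD optimization is built on top of the pure C implementation, and the speedup SIMD can deliver is bounded. If the C code itself is inefficient, the performance after SIMD parallelization will be limited as well. I also built and tested TensorFlow and NCNN on a single-core ARM v7 CPU, both with NEON enabled, and the performance gap on the squeezenet demo is still quite large.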
Could you share the performance comparison data for TensorFlow and NCNN from your tests on an ARM reference platform? Thanks!
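I tested on a HiSilicon surveillance chip, the HI3536 platform (4-core A17 at 1.4 GHz). With a single core enabled it takes 450 ms, which is already pretty good.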
The pure C implementation is not optimized, and may be more than 10x slower than the ARM-optimized version.
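To illustrate the kind of difference discussed above, here is a minimal sketch of a dot product written once as a plain C loop and once with ARM NEON intrinsics. It is not taken from the NCNN source; the function names `dot_scalar` and `dot_simd` and the `HAVE_NEON` macro are hypothetical and used only for illustration. On targets without NEON (for example MIPS), the second function simply falls back to the scalar loop, which is roughly the situation a reference-only C path is in.

```c
#include <stddef.h>

/* HAVE_NEON is a hypothetical helper macro for this sketch. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Plain C reference: one multiply-add per loop iteration. */
float dot_scalar(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Same computation, four float lanes at a time when NEON is available. */
float dot_simd(const float *a, const float *b, size_t n)
{
#if HAVE_NEON
    float32x4_t acc = vdupq_n_f32(0.0f);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));

    /* Horizontal sum of the four accumulator lanes. */
    float32x2_t half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    float sum = vget_lane_f32(vpadd_f32(half, half), 0);

    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
#else
    /* No NEON on this target: use the unoptimized reference path. */
    return dot_scalar(a, b, n);
#endif
}
```

Both functions return the same value for the same inputs (up to floating-point summation order); only the number of elements processed per iteration differs. The actual speedup depends on the compiler, memory bandwidth, and the target CPU, so no figure is claimed here.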
Closed
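On a single-core MIPS CPU, I ran the pure C version of the squeezenet demo with each framework: TensorFlow takes 3.9 s per frame, while NCNN takes 19.8 s per frame. Isn't that gap rather large?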