The pure C version shows a large performance gap compared with TensorFlow #57

Closed
YAMLONG opened this issue Jul 28, 2017 · 5 comments

Comments


YAMLONG commented Jul 28, 2017

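On a single-core MIPS CPU, I ran the squeezenet demo with the pure C version of NCNN and with TensorFlow. TensorFlow takes 3.9 s per frame, while NCNN takes 19.8 s per frame. Isn't that a rather large gap?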

Member

nihui commented Jul 28, 2017

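The pure C version is only meant as a reference implementation. So far only ARM NEON optimizations have been done; other architectures have not been optimized yet.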

Author

YAMLONG commented Jul 28, 2017

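But the SIMD optimizations are built on top of the pure C implementation, and the speedup SIMD can deliver is limited. If the C code itself is inefficient, the result after SIMD parallelization will also be limited. In addition, I compiled and tested both TensorFlow and NCNN on a single-core ARM v7 CPU with NEON optimization enabled for both, and the performance gap on the squeezenet demo is still quite large.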

Author

YAMLONG commented Jul 28, 2017

Could you share the performance comparison data for TensorFlow and NCNN that you measured on your ARM reference platform? Thanks!


whtc123 commented Jul 28, 2017

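I tested on a HiSilicon surveillance chip, the HI3536 platform (quad-core A17, 1.4 GHz). Using a single core it takes 450 ms, which is already quite good.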

Member

nihui commented Sep 30, 2017

The pure C implementation is not optimized, and it may be more than 10x slower than the ARM-optimized version.
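
As a rough illustration of the difference between a plain reference loop and an ARM NEON path, here is a minimal sketch of a dot product. It is not taken from ncnn: `dot_ref` and `dot_simd` are hypothetical names, and on non-ARM targets (such as the MIPS CPU above) the "optimized" function simply falls back to the scalar loop, which matches the situation described in this thread.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Plain C reference (hypothetical name): one multiply-add per iteration. */
static float dot_ref(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* NEON path (hypothetical name): four multiply-adds per iteration.
 * Without NEON support this just calls the reference loop. */
static float dot_simd(const float *a, const float *b, size_t n)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t acc = vdupq_n_f32(0.0f);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));

    /* Reduce the four lanes to a single float. */
    float32x2_t s = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    s = vpadd_f32(s, s);
    float sum = vget_lane_f32(s, 0);

    /* Handle the remaining 0..3 elements with scalar code. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
#else
    return dot_ref(a, b, n);
#endif
}
```

Both functions return the same value for the same input; only the per-iteration work differs, so any speedup depends on the target actually having a hand-written path like the NEON branch.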


3 participants