ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware #143

Open
gaocegege opened this issue Apr 25, 2019 · 3 comments

Comments

@gaocegege
Member

gaocegege commented Apr 25, 2019

https://arxiv.org/abs/1812.00332

https://github.com/MIT-HAN-LAB/ProxylessNAS

First author: https://han-cai.github.io/ (from the ACM Class at Shanghai Jiao Tong University)

Second author: http://lzhu.me/

@gaocegege
Member Author

gaocegege commented Apr 25, 2019

https://www.jiqizhixin.com/articles/2018-12-07-8

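This article is a useful reference, and the idea is interesting. One spot in the Jiqizhixin (机器之心) write-up is unclear enough to get in the way of reading: what DARTS needs a lot of is not main memory but GPU memory (VRAM).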

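One point worth noting in the paper is that it expresses hardware latency as a continuous function, which makes it differentiable, and then trains with it as a loss term. However, I could not find in the paper what the latency functions for GPU and CPU actually look like.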
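For intuition, here is a minimal sketch of how a latency term can become differentiable, assuming the expected-latency formulation described in the paper: the latency of a block is the probability-weighted sum of the latencies of its candidate ops, with the probabilities given by a softmax over the architecture parameters. The function names, the per-op latency numbers, and the weight `lam` are made up for illustration; the actual GPU/CPU latency predictors are not reproduced here.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over the architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(alphas, op_latencies_ms):
    """E[latency] = sum_j p_j * latency_j, a smooth function of alphas."""
    probs = softmax(alphas)
    return sum(p * lat for p, lat in zip(probs, op_latencies_ms))


def latency_grad(alphas, op_latencies_ms):
    """Analytic gradient dE[latency]/dalpha_i = p_i * (latency_i - E[latency])."""
    probs = softmax(alphas)
    e_lat = sum(p * lat for p, lat in zip(probs, op_latencies_ms))
    return [p * (lat - e_lat) for p, lat in zip(probs, op_latencies_ms)]


def total_loss(task_loss, alphas, op_latencies_ms, lam=0.1):
    """Task loss plus a latency penalty; lam is a hypothetical trade-off weight."""
    return task_loss + lam * expected_latency(alphas, op_latencies_ms)


if __name__ == "__main__":
    alphas = [0.0, 0.0, 0.0]          # hypothetical architecture parameters
    latencies = [1.0, 2.0, 3.0]       # hypothetical per-op latencies in ms
    print(expected_latency(alphas, latencies))
    print(latency_grad(alphas, latencies))
```

Because the gradient with respect to each architecture parameter is available in closed form, a gradient step pushes probability mass away from ops that are slower than the current expectation.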

@gaocegege
Member Author

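It is fairly similar to DARTS and related methods: the training happens mostly on the Suggestion side, and nothing is trained when a candidate is actually evaluated. The difference is that it uses what the paper calls path binarization, which lowers the cost of training.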
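A rough sketch of the path binarization idea, assuming the description in the paper: the architecture parameters are turned into probabilities, a one-hot binary gate is sampled from them, and only the selected candidate op is executed, so only one path's activations need to sit in GPU memory at a time. The helper names and the toy candidate ops below are hypothetical stand-ins, not the authors' code.

```python
import math
import random


def softmax(alphas):
    """Numerically stable softmax over the architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def sample_binary_gates(alphas, rng):
    """Sample a one-hot gate vector: exactly one candidate path is active."""
    probs = softmax(alphas)
    idx = rng.choices(range(len(alphas)), weights=probs, k=1)[0]
    return [1 if i == idx else 0 for i in range(len(alphas))]


def mixed_op_forward(x, candidate_ops, gates):
    """Run only the active path instead of a weighted sum over all paths."""
    active = gates.index(1)
    return candidate_ops[active](x)


if __name__ == "__main__":
    # Toy stand-ins for candidate ops such as conv3x3, conv5x5, identity.
    ops = [lambda x: x * 2, lambda x: x + 10, lambda x: x]
    rng = random.Random(0)
    gates = sample_binary_gates([0.5, 0.1, -0.3], rng)
    print(gates, mixed_op_forward(3, ops, gates))
```

By contrast, a DARTS-style mixed op evaluates every candidate and sums the weighted outputs, which is why its GPU memory use grows with the number of candidates.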

@gaocegege
Member Author

gaocegege commented May 20, 2019

https://www.zhihu.com/question/296404213/answer/547163236 is worth a read.

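Common proxies:

  • Search on a small dataset (CIFAR) first, then transfer to a large dataset (ImageNet).
  • Search for a relatively shallow network first, then repeatedly stack the same building block to get a deeper network.
  • Train for only a few epochs (e.g. 5 epochs), then validate.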
