add design doc for FPGA on Paddle #4027
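@@ -0,0 +1,84 @@
# Baidu FPGA on PaddlePaddle: Design Doc

We plan to integrate FPGA support into PaddlePaddle using Baidu FPGA cloud servers and the deep learning acceleration library Polaris, so that PaddlePaddle supports more heterogeneous hardware.
- Baidu FPGA cloud server: an FPGA compute instance provided in the Baidu Cloud environment, with which FPGA hardware-accelerated programs can be built quickly \[[1](#references)\].
- Polaris: a high-performance, FPGA-based deep learning computing library developed by Baidu, which gives users convenient access to the functions implemented on Baidu FPGAs \[[2](#references)\].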

We plan to develop on top of the PaddlePaddle code base that is currently being refactored. The goals are:

Review comment: A design doc does not need to say "plan".

- Baidu FPGA supports most common deep learning Operators.
- Baidu FPGA supports most common deep learning models.

Review comments:
- Regarding "common": are both image and NLP models supported?
- We plan to cover all of them.
- This could be stated explicitly in the doc: it covers image, NLP, speech, and so on; my list may not be complete.
- OK, thx.

## Contents

- [Overview](#overview)
- [Actions](#actions)
    - [CMake](#cmake)
    - [Place](#place)
    - [Memory](#memory)
    - [DeviceContext](#devicecontext)
    - [Operator](#operator)
    - [Net](#net)
    - [UnitTest](#unittest)
    - [Python API](#python-api)
- [References](#references)

## Overview

Polaris will be integrated into PaddlePaddle as a third-party library. With Baidu FPGA cloud servers, FPGA-based PaddlePaddle applications can then be built quickly. The overall architecture is shown below:
<div align="center">
<img src="image/overview.png" width=350><br/>
Figure 1. FPGA on Paddle.
</div>

Review comments:
- This figure could be made a bit smaller.
- OK.

## Actions
The integration work is roughly divided into the following parts.
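
### CMake
We will add a `WITH_FPGA` option to `CMakeLists.txt`; setting it to `ON` enables the FPGA build. We will also create a `polaris.cmake` file under the `cmake/external` directory, which downloads the Polaris headers and library when PaddlePaddle is built and places them in PaddlePaddle's `third_party` directory.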
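
### Place
The refactored PaddlePaddle uses the Place class to mark which device data resides on; CPUPlace and GPUPlace are currently supported. We will add FPGAPlace to place.h to mark data residing on an FPGA device, with an `int` device id to support multiple FPGA devices. Following the implementations of CPUPlace and GPUPlace, we will also provide methods such as is_fpga_place.

**Note**: Some code in the refactored PaddlePaddle assumes that a device is either a GPU or a CPU, for example:
```C++
// Treats every non-GPU place as the same class.
bool places_are_same_class(const Place &p1, const Place &p2) {
  return is_gpu_place(p1) == is_gpu_place(p2);
}
```
This logic needs to be changed once FPGAPlace is added.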
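
The change described above can be sketched as follows. This is a minimal, self-contained illustration that assumes C++17 and models `Place` with `std::variant`; the struct layouts and the `device` field name are assumptions made for the example, not the actual contents of place.h. It shows an `FPGAPlace` carrying an `int` device id, an `is_fpga_place` helper, and a `places_are_same_class` that compares device classes directly instead of assuming that every non-GPU place is a CPU place.

```cpp
#include <variant>

// Hypothetical stand-ins for the place types; the real definitions in
// place.h may differ.
struct CPUPlace {};
struct GPUPlace {
  int device = 0;
};
struct FPGAPlace {
  int device = 0;  // device id, so that multiple FPGA devices can be addressed
};

using Place = std::variant<CPUPlace, GPUPlace, FPGAPlace>;

inline bool is_cpu_place(const Place &p) {
  return std::holds_alternative<CPUPlace>(p);
}
inline bool is_gpu_place(const Place &p) {
  return std::holds_alternative<GPUPlace>(p);
}
inline bool is_fpga_place(const Place &p) {
  return std::holds_alternative<FPGAPlace>(p);
}

// Two places belong to the same class when they hold the same alternative,
// regardless of device id. This avoids assuming that "not GPU" means "CPU".
inline bool places_are_same_class(const Place &p1, const Place &p2) {
  return p1.index() == p2.index();
}
```

With this version, a `CPUPlace` and an `FPGAPlace` are no longer reported as the same class, which the GPU-only comparison would have done.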

### Memory
The refactored PaddlePaddle implements a buddy memory allocator class for memory management; only CPU and GPU versions are currently supported. We will add an FPGAAllocator to manage FPGA memory.
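
As a rough sketch of where an FPGAAllocator could plug in, the example below defines a simplified, hypothetical allocator interface and an `FPGAAllocator` that implements it. The interface, the method names, and the byte counter are assumptions for illustration only. `std::malloc` stands in for the FPGA runtime's device-memory calls, which this document does not specify, and the buddy allocator layer on top is omitted.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified allocator interface (not the PaddlePaddle one).
class SystemAllocator {
 public:
  virtual ~SystemAllocator() = default;
  virtual void *Alloc(std::size_t size) = 0;
  virtual void Free(void *p, std::size_t size) = 0;
};

// Sketch of an FPGA allocator bound to one device id.
// NOTE: std::malloc/std::free are placeholders so the sketch runs on a host;
// a real implementation would call the FPGA runtime here instead.
class FPGAAllocator : public SystemAllocator {
 public:
  explicit FPGAAllocator(int device_id) : device_id_(device_id) {}

  void *Alloc(std::size_t size) override {
    if (size == 0) return nullptr;
    void *p = std::malloc(size);
    if (p != nullptr) allocated_ += size;  // track bytes currently in use
    return p;
  }

  void Free(void *p, std::size_t size) override {
    if (p == nullptr) return;
    std::free(p);
    allocated_ -= size;
  }

  std::size_t allocated() const { return allocated_; }
  int device_id() const { return device_id_; }

 private:
  int device_id_;
  std::size_t allocated_ = 0;
};
```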
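
### DeviceContext
The refactored PaddlePaddle implements the DeviceContext class to manage the resources of the corresponding device; only CPU and GPU versions are currently supported. We will add an FPGADeviceContext to manage FPGA device resources.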
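
### Operator
The refactored PaddlePaddle is Operator-based: different devices (CPU, GPU) share one Op definition, and each OpKernel provides a Compute method \[[3](#references)\]. We will add an FPGAKernel to the existing XXX_op.cc files to implement the FPGA computation logic. For operations that FPGA does not support, we add a check in the Compute function:
```C++
// Reject FPGAPlace in kernels that have no FPGA implementation.
PADDLE_ENFORCE(!platform::is_fpga_place(ctx.GetPlace()),
               "It can not use FPGAPlace.");
```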

Review comments:
- Could the FPGA ops be written as FPGAXXX_op.cc, i.e. kept separate from the existing XXX_op.cc? The main considerations:
- That is no problem. But if many Ops are supported, there will be quite a few FPGAXXX_op.cc files. Is that a problem?
- No problem, that is clearer.
- @luotao1 Taking lookup_table as an example, the current GPU registration code is
- A suggestion: would it be better to add XXX_op_fpga.cc to implement the FPGA logic, rather than FPGAXXX_op.cc? The main consideration is that, for a given op, the CPU, GPU, and FPGA implementations would sit next to each other in the directory structure.
- @QingshuChen That would indeed be clearer.
- I am not very familiar with FPGA, so I will use the current MKLDNN integration work as an example to express my view:
- @QingshuChen XXX_op_fpga.cc and XXX_op_fpga.h are indeed a bit better.
- @shijiaxin For now, treating it as a kernel registered on the existing op seems fine.
- This could be stated explicitly in the doc; if there is an official link describing the format, it would be best to include it. So XXX_op_fpga.cc can be added, with XXXFPGAKernel implemented in it: the code is separated, and the ops do not need to be separated.

**Note**: Because FPGA is less flexible than CPU and GPU, FPGA can support most Operators but not all of them.
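
The idea of one Op definition with a separate kernel per device, plus a check that rejects devices without a kernel, can be illustrated with a small registry. The `Device` enum, `KernelRegistry`, and the op names used with it are hypothetical names invented for this sketch and are not PaddlePaddle APIs; PaddlePaddle's own kernel definition mechanism is described in \[[3](#references)\].

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical device classes for the sketch.
enum class Device { kCPU, kGPU, kFPGA };

// A kernel here is just a function that transforms a buffer in place.
using Kernel = std::function<void(std::vector<float> &)>;

// Maps (op name, device) to a kernel, so one op can have several kernels.
class KernelRegistry {
 public:
  void Register(const std::string &op, Device dev, Kernel kernel) {
    kernels_[std::make_pair(op, dev)] = std::move(kernel);
  }

  bool Has(const std::string &op, Device dev) const {
    return kernels_.count(std::make_pair(op, dev)) > 0;
  }

  // Runs the kernel for (op, dev); throws if none was registered, which
  // plays the role of the enforce check for unsupported devices.
  void Run(const std::string &op, Device dev, std::vector<float> &data) const {
    auto it = kernels_.find(std::make_pair(op, dev));
    if (it == kernels_.end()) {
      throw std::runtime_error("no kernel registered for op '" + op +
                               "' on the requested device");
    }
    it->second(data);
  }

 private:
  std::map<std::pair<std::string, Device>, Kernel> kernels_;
};
```

An op that only registers CPU and GPU kernels then fails loudly when asked to run on FPGA, instead of silently running the wrong code path.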
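
### Net
A Net contains a series of Operators, and currently all Operators in a Net must run on the same device. Because FPGA is less flexible, some Operators may not be supported on FPGA, so an approach similar to Parallel-nn is needed, in which some Operators run on FPGA and the others run on CPU or GPU.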
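
One simple way to picture this fallback is a per-Operator device assignment: run an Operator on FPGA when an FPGA kernel exists for it, and fall back to CPU otherwise. The function and type names below are hypothetical, and the sketch ignores data transfer between devices and GPU fallback, both of which a real design would need to address.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical device classes for the sketch.
enum class Device { kCPU, kFPGA };

struct Assignment {
  std::string op;
  Device device;
};

// Assigns each Operator to FPGA when its name is in fpga_supported and to
// CPU otherwise. The order of the Operators is preserved.
inline std::vector<Assignment> AssignDevices(
    const std::vector<std::string> &ops,
    const std::set<std::string> &fpga_supported) {
  std::vector<Assignment> plan;
  plan.reserve(ops.size());
  for (const std::string &op : ops) {
    const bool on_fpga = fpga_supported.count(op) > 0;
    plan.push_back({op, on_fpga ? Device::kFPGA : Device::kCPU});
  }
  return plan;
}
```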

Review comment: There is a discussion on multi-device execution in #4031. It is still in the design stage, so feel free to join the discussion.

### UnitTest
Unit tests for FPGA-related code are added to the corresponding modified modules. For example, once FPGAPlace is added, FPGA unit tests need to be added to place_test.cc.

Review comments:
- Unit testing may be a problem, because the CI probably does not support FPGA devices yet.
- We can provide an FPGA cloud server and run the CI tests on it. Would that work?
- That works. But is one machine enough?
- Great! That works; a TeamCity instance may need to be set up on it.

### Python API
For now only the v2 API is considered. To use an FPGA, simply set the Place to FPGA; all other configuration stays unchanged.
```python
place = core.FPGAPlace()
```

## References
1. [Baidu FPGA Cloud Server](https://cloud.baidu.com/product/fpga.html)
2. [Baidu Polaris Project](http://fpga.baidu.com/)
3. [How to Write a New Operator](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/new_op_cn.md#%E5%AE%9A%E4%B9%89OpKernel%E7%B1%BB)

Review comments:
- Is the Polaris library open source?
- The header files and .a files of the Polaris library will be published on GitHub.
- Then could the description be changed to: "Polaris: a high-performance, FPGA-based deep learning computing library developed by Baidu (to be open-sourced, including header files and .a files)"?
- OK.