-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add init interface for customize devices. #10167
Conversation
paddle/fluid/inference/io.h
Outdated
@@ -27,6 +27,8 @@ namespace inference { | |||
|
|||
void Init(bool init_p2p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个Init接口可以删掉了。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/fluid/framework/init.cc
Outdated
#endif | ||
|
||
for (size_t i = 0; i < devices.size(); ++i) { | ||
if (devices[i] >= count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
devices[i]
必须:
> 0
- 不能重复
这个能不能保证下?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Xreki DeviceContextPool有对places去重,而且CUDAPlace按device_id
判断是否相等:https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/place.h#L41
所以,我们不用在这里check是否重复
paddle/fluid/framework/init.cc
Outdated
|
||
void Init(int argc, char **argv) { | ||
std::call_once(gflags_init_flag, | ||
[&]() { google::ParseCommandLineFlags(&argc, &argv, true); }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里调用InitGflags
,然后修复下IntiGflags
的问题?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thx.
paddle/fluid/framework/init.cc
Outdated
|
||
#include "paddle/fluid/framework/init.h" | ||
#include "paddle/fluid/framework/operator.h" | ||
#include "paddle/fluid/platform/device_context.h" | ||
#include "paddle/fluid/platform/device_context.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
头文件重复了?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix.
paddle/fluid/framework/init.cc
Outdated
@@ -64,6 +85,30 @@ void InitP2P(int count) { | |||
#endif | |||
} | |||
|
|||
void InitP2P(std::vector<int> devices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line67行的InitP2P
函数,可以直接调用该函数吧
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我的意思是:void InitP2P(int count) 和 你新加的 InitP2P(std::vector<int> devices)
内部实现几乎一样,可以先把 std::vector<int> devices
构造出来,直接调用新增的这个。 或者上面那个还有必要存在吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qingqing01 明白了,我改成在InitDevices(bool initP2P)
里调用InitDevices(bool initP2P, vector<int> devices)
, 然后void InitP2P(int count)
也就没用了,已将其删除。
@@ -25,7 +25,7 @@ limitations under the License. */ | |||
namespace paddle { | |||
namespace inference { | |||
|
|||
void Init(bool init_p2p); | |||
void Init(const std::vector<std::string> argv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why there are two Init
interfaces, one in paddle/fluid/inference/io.h
, another in paddle/fluid/framework/init.h
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paddle/fluid/framework/init.h
中的init
确实多余,也没有被其它地方用到,已经将其删除。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paddle/paddle/fluid/inference/io.cc
Lines 27 to 29 in c6a7042
// Temporarily add this function for exposing framework::InitDevices() when | |
// linking the inference shared library. | |
void Init(bool init_p2p) { framework::InitDevices(init_p2p); } |
这里有注释。因为当前libpaddle_fluid.so
不再使用whole-archive
链接,而framework/init.h
中的函数,没有被Fluid其他的C++代码调用到,在链接生成libpaddle_fluid.so
的时候,framework/init.h
里面的符号就没有链接进来。用户inference代码里面需要显式调用paddle::framework::InitDevices
,在使用libpaddle_fluid.so
的时候,会出现undefined symbols paddle::framework::InitDevices
的错误。
paddle/fluid/framework/init.cc
Outdated
@@ -31,6 +30,7 @@ std::once_flag p2p_init_flag; | |||
|
|||
void InitGflags(std::vector<std::string> argv) { | |||
std::call_once(gflags_init_flag, [&]() { | |||
argv.push_back("dummy"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里是不是应该插入到argv[0]
之前?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的,感谢提醒。
paddle/fluid/inference/io.cc
Outdated
#include "paddle/fluid/framework/block_desc.h" | ||
#include "paddle/fluid/framework/feed_fetch_type.h" | ||
#include "paddle/fluid/framework/op_registry.h" | ||
#include "paddle/fluid/pybind/pybind.h" | ||
|
||
DEFINE_string(devices, "", "The devices to be used."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
注释里需要说明分割符是什么
paddle/fluid/inference/io.cc
Outdated
#include "paddle/fluid/framework/block_desc.h" | ||
#include "paddle/fluid/framework/feed_fetch_type.h" | ||
#include "paddle/fluid/framework/op_registry.h" | ||
#include "paddle/fluid/pybind/pybind.h" | ||
|
||
DEFINE_string(devices, "", "The devices to be used."); | ||
DEFINE_bool(init_p2p, true, "Whether to init p2p."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
默认是False吧,单机单卡的行为不需要p2p吧。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But @Xreki please have a look again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and thanks very much.
在当前单测实现逻辑中,
RUN_ALL_TESTS()
前需要初始化所有device,这与"初始化部分devices"的单测有所冲突,所以这个pr没有对void InitDevices(bool init_p2p, const std::vector<int> devices)
进行单测。在线下,通过注释掉这句,通过了以下单测: