Add init interface for customize devices. #10167

Merged: 9 commits merged into PaddlePaddle:develop from the fluid_init branch on Apr 27, 2018

Contributor

@wanghaoshuang wanghaoshuang commented Apr 24, 2018

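In the current unit-test setup, all devices must be initialized before RUN_ALL_TESTS() is called. That conflicts with a unit test that initializes only some of the devices, so this PR does not add a unit test for void InitDevices(bool init_p2p, const std::vector<int> devices).

Offline, with that initialization statement commented out, the following unit test passed: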

TEST(InitDevices, single_CUDA) {
  using paddle::framework::Init;
  using paddle::platform::DeviceContextPool;

#ifdef PADDLE_WITH_CUDA
  int count = paddle::platform::GetCUDADeviceCount();
  std::vector<char*> argvs;
  argvs.push_back(strdup("--devices=0,1"));
  int argc = static_cast<int>(argvs.size());
  char** argv = argvs.data();
  Init(&argc, &argv);
  DeviceContextPool& pool = DeviceContextPool::Instance();
  ASSERT_EQ(pool.size(), 3U);
#endif
}

@@ -27,6 +27,8 @@ namespace inference {

void Init(bool init_p2p);
Contributor

This Init interface can be removed now.

Contributor Author

Done.

#endif

for (size_t i = 0; i < devices.size(); ++i) {
if (devices[i] >= count) {
Contributor

Each devices[i] must:

  1. be > 0
  2. not be duplicated

Can this be guaranteed?

Contributor Author

@Xreki DeviceContextPool already deduplicates places, and CUDAPlace equality is determined by device_id: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/place.h#L41
So we do not need to check for duplicates here.
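
For readers following this thread, here is a minimal sketch of the two checks under discussion: a range check on each device id, and deduplication. It is not the code from this PR. `NormalizeDeviceIds` and its parameters are hypothetical names, and the sketch assumes that id 0 is valid (the offline test above passes `--devices=0,1`), so it rejects only negative ids and ids at or above the device count.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of Paddle: validates the requested device ids
// against the number of visible devices and drops repeated ids, keeping the
// order in which each id first appears.
std::vector<int> NormalizeDeviceIds(const std::vector<int>& devices,
                                    int device_count) {
  std::set<int> seen;
  std::vector<int> result;
  for (int id : devices) {
    // Assumption: valid ids are 0 .. device_count - 1.
    if (id < 0 || id >= device_count) {
      throw std::out_of_range("invalid device id: " + std::to_string(id));
    }
    // std::set::insert reports whether the id was newly inserted.
    if (seen.insert(id).second) {
      result.push_back(id);
    }
  }
  return result;
}
```

If the pool already deduplicates places, as the reply above says, only the range check would be needed at this call site.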


void Init(int argc, char **argv) {
std::call_once(gflags_init_flag,
[&]() { google::ParseCommandLineFlags(&argc, &argv, true); });
Contributor

Could this call InitGflags here, and then fix the problem in InitGflags?

Contributor Author

Done. Thx.


#include "paddle/fluid/framework/init.h"
#include "paddle/fluid/framework/operator.h"
#include "paddle/fluid/platform/device_context.h"
#include "paddle/fluid/platform/device_context.h"
Contributor

Is this header included twice?

Contributor Author

Fixed.

@@ -64,6 +85,30 @@ void InitP2P(int count) {
#endif
}

void InitP2P(std::vector<int> devices) {
Contributor

The InitP2P function at line 67 could simply call this function, right?

Contributor Author

Done. Thx.

Contributor

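What I mean is: void InitP2P(int count) and your newly added InitP2P(std::vector<int> devices) have almost identical implementations. You could construct the std::vector<int> devices first and call the new function directly. Or does the one above still need to exist at all?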

Contributor Author

@qingqing01 Understood. I changed InitDevices(bool initP2P) to call InitDevices(bool initP2P, vector<int> devices), which leaves void InitP2P(int count) unused, so I have removed it.
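
The delegation described in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `InitDevicesSketch` and `GetDeviceCountStub` are hypothetical stand-ins, the device count is hard-coded, and the functions return the chosen ids instead of creating device contexts.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a device-count query such as
// paddle::platform::GetCUDADeviceCount(); hard-coded for the sketch.
int GetDeviceCountStub() { return 2; }

// Sketch of the general overload: takes the explicit list of device ids.
// Real code would set up contexts (and P2P access if init_p2p is true);
// here it only returns the ids it was asked to initialize.
std::vector<int> InitDevicesSketch(bool init_p2p,
                                   const std::vector<int>& devices) {
  (void)init_p2p;  // unused in this sketch
  return devices;
}

// Sketch of the convenience overload: builds the list 0 .. count - 1 and
// forwards to the general overload, so no count-based duplicate is needed.
std::vector<int> InitDevicesSketch(bool init_p2p) {
  std::vector<int> devices(static_cast<std::size_t>(GetDeviceCountStub()));
  std::iota(devices.begin(), devices.end(), 0);
  return InitDevicesSketch(init_p2p, devices);
}
```

With one overload forwarding to the other, any per-device logic lives in a single place, which is the point the reviewer raised about the two near-identical InitP2P functions.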

@Xreki Xreki added the 预测 (formerly named Inference; covers C-API inference issues, etc.) label Apr 24, 2018
@@ -25,7 +25,7 @@ limitations under the License. */
namespace paddle {
namespace inference {

void Init(bool init_p2p);
void Init(const std::vector<std::string> argv);
Contributor

Why are there two Init interfaces, one in paddle/fluid/inference/io.h and another in paddle/fluid/framework/init.h?

Contributor Author

@wanghaoshuang wanghaoshuang Apr 24, 2018

The Init in paddle/fluid/framework/init.h is indeed redundant and is not used anywhere else, so I have removed it.

Contributor

@qingqing01 @wanghaoshuang

// Temporarily add this function for exposing framework::InitDevices() when
// linking the inference shared library.
void Init(bool init_p2p) { framework::InitDevices(init_p2p); }

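There is a comment here explaining it. libpaddle_fluid.so is no longer linked with whole-archive, and the functions in framework/init.h are not called by any other Fluid C++ code, so the symbols from framework/init.h are not pulled in when libpaddle_fluid.so is linked. User inference code has to call paddle::framework::InitDevices explicitly, so using libpaddle_fluid.so fails with an "undefined symbols paddle::framework::InitDevices" error.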

@@ -31,6 +30,7 @@ std::once_flag p2p_init_flag;

void InitGflags(std::vector<std::string> argv) {
std::call_once(gflags_init_flag, [&]() {
argv.push_back("dummy");
Contributor

Shouldn't this be inserted before argv[0]?

Contributor Author

Yes, thanks for the reminder.
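
A small sketch of the fix agreed on here: the placeholder goes at the front of the argument list rather than at the back, on the assumption that the flag parser treats argv[0] as the program name and skips it. `WithProgramName` and `ToArgv` are hypothetical helper names, and the sketch stops short of calling gflags.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of the flags with a placeholder
// program name inserted at index 0, ahead of the real flags.
std::vector<std::string> WithProgramName(std::vector<std::string> flags) {
  flags.insert(flags.begin(), std::string("dummy"));
  return flags;
}

// Hypothetical helper: builds a char* view over the strings, in the shape a
// C-style argv parser expects. The pointers stay valid only while `args` is
// alive and unmodified.
std::vector<char*> ToArgv(std::vector<std::string>& args) {
  std::vector<char*> argv;
  argv.reserve(args.size());
  for (std::string& s : args) {
    argv.push_back(&s[0]);
  }
  return argv;
}
```

With `push_back("dummy")`, as in the diff above, the first real flag would sit at argv[0], where a parser that skips the program name would ignore it.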

#include "paddle/fluid/framework/block_desc.h"
#include "paddle/fluid/framework/feed_fetch_type.h"
#include "paddle/fluid/framework/op_registry.h"
#include "paddle/fluid/pybind/pybind.h"

DEFINE_string(devices, "", "The devices to be used.");
Contributor

@qingqing01 qingqing01 Apr 26, 2018

The description should state what the separator is.
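
To make the separator question concrete, here is a minimal sketch of turning the flag value into device ids. It assumes a comma separator, based only on the `--devices=0,1` string in the offline test above. `ParseDeviceList` is a hypothetical name, and this is not the parsing code from the PR.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-separated flag value such as "0,1"
// into integer device ids. Empty tokens (e.g. from "0,,1" or an empty
// string) are skipped; std::stoi throws std::invalid_argument on tokens
// that do not start with a number.
std::vector<int> ParseDeviceList(const std::string& flag_value) {
  std::vector<int> devices;
  std::stringstream stream(flag_value);
  std::string token;
  while (std::getline(stream, token, ',')) {
    if (token.empty()) {
      continue;
    }
    devices.push_back(std::stoi(token));
  }
  return devices;
}
```

Under this assumption, a flag description along the lines of "comma-separated device ids, e.g. 0,1" would answer the reviewer's request.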

#include "paddle/fluid/framework/block_desc.h"
#include "paddle/fluid/framework/feed_fetch_type.h"
#include "paddle/fluid/framework/op_registry.h"
#include "paddle/fluid/pybind/pybind.h"

DEFINE_string(devices, "", "The devices to be used.");
DEFINE_bool(init_p2p, true, "Whether to init p2p.");
Contributor

The default should be False, right? A single-machine, single-GPU run does not need p2p.

Contributor

@qingqing01 qingqing01 left a comment

LGTM. But @Xreki please have a look again.

Contributor

@Xreki Xreki left a comment

LGTM and thanks very much.

@wanghaoshuang wanghaoshuang merged commit 2f9fa9b into PaddlePaddle:develop Apr 27, 2018
@wanghaoshuang wanghaoshuang deleted the fluid_init branch May 20, 2022 03:59
Labels
预测 (formerly named Inference; covers C-API inference issues, etc.)
3 participants