Enhance distributed train performance #7608

typhoonzero · 2018-01-17T12:02:52Z

gongweibao

We can merge this first, and then I'll continue cleaning up and optimizing the code.

gongweibao · 2018-01-18T11:50:27Z

paddle/operators/recv_op.cc

+      // Get from multiple trainers, we don't care about the order in which
+      // the gradients arrives, just add suffix 0~n and merge the gradient.
+      rpc_service_->SetCond(0);
+      for (size_t i = 0; i < barrier_size; ++i) {

This block of code is too long, which makes the main loop rather complex and error-prone. I suggest moving it into a separate function.
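
A rough sketch of what such an extraction could look like, not the actual `recv_op.cc` code. The helper name `AssignArrivalSuffixes` and the `.trainer_<k>` suffix format are hypothetical, chosen only to illustrate the "add suffix 0~n" step from the comment above; the real RPC calls are omitted.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: given gradient names in the order they arrived,
// give each one a per-name arrival index as a suffix. The arrival order
// across trainers does not matter; the k-th copy of a name gets suffix k.
// The ".trainer_<k>" format is an assumption made for this sketch.
std::vector<std::string> AssignArrivalSuffixes(
    const std::vector<std::string>& arrived_names) {
  std::map<std::string, std::size_t> seen;  // copies seen so far, per name
  std::vector<std::string> suffixed;
  suffixed.reserve(arrived_names.size());
  for (const auto& name : arrived_names) {
    const std::size_t k = seen[name]++;  // 0 for the first copy, then 1, ...
    suffixed.push_back(name + ".trainer_" + std::to_string(k));
  }
  return suffixed;
}
```

With the suffixing isolated like this, the main loop would only need to call the helper once per batch and then run the merge step, which keeps the loop body short and makes the helper testable on its own.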

gongweibao

LGTM

gongweibao

LGTM++

typhoonzero added 3 commits January 15, 2018 20:55

dist train support split selectedrows

bcfb82d

enhance dist train performance

02ea349

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

f233b93

… distributed_split_selectedrows

helinwang self-requested a review January 18, 2018 04:39

typhoonzero added 2 commits January 18, 2018 18:27

fix comm issues

ae19d2e

merge codes

5f4d913

typhoonzero requested a review from gongweibao January 18, 2018 11:32

gongweibao reviewed Jan 18, 2018

gongweibao previously approved these changes Jan 18, 2018

delete debug transpiler code

30529e3

typhoonzero dismissed gongweibao’s stale review via 30529e3 January 18, 2018 12:03

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

0aff136

… distributed_split_selectedrows

gongweibao approved these changes Jan 19, 2018

typhoonzero merged commit 58be41f into PaddlePaddle:develop Jan 19, 2018

typhoonzero deleted the distributed_split_selectedrows branch January 19, 2018 03:13
