
[PIR][AutoParallel] Support merge fetch results among micro_scopes #58528

Merged
11 commits merged into PaddlePaddle:develop on Nov 3, 2023

Conversation

zhaoyinglia
Contributor

@zhaoyinglia zhaoyinglia commented Oct 31, 2023

PR types

Others

PR changes

Others

Description

Pcard-76459

  • Support merging fetch results across micro-scopes.
  • Add FetchTensors to collect the fetch results from every micro-scope.
  • Add MergeFetchTensors to merge fetch results that share the same name across different scopes.

@zhaoyinglia zhaoyinglia changed the title [PIR][AutoParallel] Support MicroScope Merge Fetch Result [PIR][AutoParallel] Support Merge Fetch Result among MicroScopes Oct 31, 2023
@zhaoyinglia zhaoyinglia changed the title [PIR][AutoParallel] Support Merge Fetch Result among MicroScopes [PIR][AutoParallel] Support merge fetch results among micro_scopes Oct 31, 2023
@paddle-bot paddle-bot bot added the contributor External developers label Oct 31, 2023
From00
From00 previously approved these changes Nov 2, 2023
Contributor

@From00 From00 left a comment


LGTM overall

Scope* scope,
FetchUnmergedList* fetch_list);

void MergeFetchTensors(const FetchUnmergedList& fetch_list,
Contributor

Suggested change
void MergeFetchTensors(const FetchUnmergedList& fetch_list,
void MergeFetchList(const FetchUnmergedList& fetch_list,

@@ -71,11 +54,17 @@ class Job final {
skip_gc_vars_ = skip_gc_vars;
}

void SetFetchVarName(std::string fetch_var_name) {
Contributor

Suggested change
void SetFetchVarName(std::string fetch_var_name) {
void SetFetchVarName(const std::string& fetch_var_name) {

Contributor Author

Done.
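For context on the suggestion above: taking `std::string` by value copies the argument on every call, while a `const` reference binds without copying. A minimal sketch of the pattern, with the class trimmed down to just this setter (the accessor is added here only to make the sketch testable):

```cpp
#include <string>

// Trimmed illustration of the suggested setter signature: a const
// reference avoids copying the argument into the parameter, so the
// only copy is the assignment into the member.
class Job {
 public:
  void SetFetchVarName(const std::string& fetch_var_name) {
    fetch_var_name_ = fetch_var_name;
  }
  const std::string& FetchVarName() const { return fetch_var_name_; }

 private:
  std::string fetch_var_name_;
};
```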

if (need_fetch) {
  for (auto& var_name : fetch_var_names_) {
    auto* var = inner_scope->FindVar(var_name);
    VLOG(0) << "fetch " << var_name << "[" << var << "]";
Contributor

Suggested change
VLOG(0) << "fetch " << var_name << "[" << var << "]";
VLOG(6) << "fetch " << var_name << "[" << var << "]";

Contributor Author

Done.
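The level change matters because glog-style verbose logs at level n are emitted only when the configured verbosity (`GLOG_v` in Paddle) is at least n: level 0 prints in every run, while level 6 stays silent by default. A simplified model of the gating (not the real glog API):

```cpp
#include <string>

// Simplified model of glog-style VLOG gating: a message at `level`
// is emitted only when the configured verbosity is >= `level`.
bool VlogIsOn(int level, int verbosity) { return verbosity >= level; }

// Returns the message if it would be emitted, else an empty string.
std::string MaybeLog(int level, int verbosity, const std::string& msg) {
  return VlogIsOn(level, verbosity) ? msg : "";
}
```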

feed_names[i],
numel_size,
micro_batch_num));
int64_t split_size = (numel_size + micro_batch_num - 1) / micro_batch_num;
Contributor

Suggested change
int64_t split_size = (numel_size + micro_batch_num - 1) / micro_batch_num;
int64_t split_size = numel_size / micro_batch_num;

Contributor Author

Done.

VLOG(4) << "Split feed data:" << feed_names[i] << ", dims:("
        << feed_tensor.dims() << "), micro_batch_num:" << micro_batch_num;
for (int64_t j = 0; j < micro_batch_num; ++j) {
  (*out)[j].resize(i + 1);
Contributor

The space for all feed_tensors.size() elements can be allocated directly up front.

FetchList* out);

void MergeTensors(const std::vector<const phi::DenseTensor*>& tensors,
platform::Place dst_place,
Contributor

Suggested change
platform::Place dst_place,
const platform::Place& dst_place,

Contributor Author

Done.

@paddle-bot paddle-bot bot removed the contributor External developers label Nov 3, 2023
Contributor

@zhiqiu zhiqiu left a comment


LGTM

@zhiqiu zhiqiu merged commit 5d707be into PaddlePaddle:develop Nov 3, 2023
zeroRains pushed a commit to zeroRains/Paddle that referenced this pull request Nov 8, 2023
…addlePaddle#58528)

* [PIR][AutoParallel] Support MicroScope Merge Fetch Result

* fix func name

* revert plan

* fix feed

* tiny fix

* rm print&update ut

* rm comment

* fix splitfeed & ut

* add mergetensor func

* fix ut

* update split_size
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
…addlePaddle#58528)

3 participants