
[Semi-Auto] Add reduction spmd rule #54991

Merged
merged 4 commits into PaddlePaddle:develop from new_reduction
Jul 7, 2023

Conversation

pkuzyc
Contributor

@pkuzyc pkuzyc commented Jun 29, 2023

PR types

New features

PR changes

Others

Description

Pcard-70448

  1. Modify the ExtractAttr common function to handle mismatches between Python and C++ attribute types (e.g. a Python bool stored as a C++ int).
  2. Add the reduction op's spmd rule for inferring distributed attributes. Implement the InferForward function for the reduction op, i.e. infer the output tensor's distributed attributes from the input tensors'.
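
As a rough illustration of fix 1 (the names `Attr`, `AttrMap`, and `ExtractBoolAttr` here are hypothetical, not Paddle's actual API): when the Python front end serializes a bool attribute as an int, extraction of a bool must tolerate the int-typed entry:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Toy attribute storage standing in for the framework's attribute map.
using Attr = std::variant<bool, int, std::vector<int64_t>>;
using AttrMap = std::map<std::string, Attr>;

// Extract a bool attribute, falling back to an int-typed entry when the
// Python side stored the bool as an int (the type-mismatch case the PR fixes).
bool ExtractBoolAttr(const std::string& name, const AttrMap& attrs) {
  const Attr& a = attrs.at(name);
  if (std::holds_alternative<bool>(a)) return std::get<bool>(a);
  return std::get<int>(a) != 0;
}
```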

@paddle-bot
Copy link

paddle-bot bot commented Jun 29, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@pkuzyc pkuzyc marked this pull request as draft June 29, 2023 08:28
@pkuzyc pkuzyc marked this pull request as ready for review July 1, 2023 09:37
@@ -24,6 +25,7 @@ namespace auto_parallel {

// matmul rule
REGISTER_SPMD_RULE(matmul, MatmulSPMDRule);
Contributor

Register with the op names specifically:

reduce_sum, sum, reduce_max, reduce_min, mean

Contributor Author

Done
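
A toy sketch of what per-op-name registration amounts to (the registry type here is illustrative; Paddle's real mechanism is the REGISTER_SPMD_RULE macro shown in the diff): a single ReductionSPMDRule instance serves reduce_sum, sum, reduce_max, reduce_min and mean:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Minimal stand-ins for the rule class hierarchy.
struct SPMDRuleBase { virtual ~SPMDRuleBase() = default; };
struct ReductionSPMDRule : SPMDRuleBase {};

// Toy registry mapping concrete op names to rule instances.
std::map<std::string, std::shared_ptr<SPMDRuleBase>>& Registry() {
  static std::map<std::string, std::shared_ptr<SPMDRuleBase>> reg;
  return reg;
}

// Register one shared reduction rule under every reduction op name,
// as the reviewer requests.
void RegisterReductionRules() {
  auto rule = std::make_shared<ReductionSPMDRule>();
  for (const char* op :
       {"reduce_sum", "sum", "reduce_max", "reduce_min", "mean"}) {
    Registry()[op] = rule;
  }
}
```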


// step1: Build Einsum Notation
bool keep_dim = ExtractAttr<bool>("keep_dim", attrs);
// bool keep_dim = false;
Contributor

remove it

Contributor Author

Done

bool keep_dim = ExtractAttr<bool>("keep_dim", attrs);
// bool keep_dim = false;
std::vector<int64_t> reduce_dims =
ExtractAttr<std::vector<int64_t>>("dim", attrs);
Contributor

Where does "dim" come from? Should it be "axis"?

Phi API:

  • op : max
    args : (Tensor x, IntArray axis={}, bool keepdim=false)
  • op : mean
    args : (Tensor x, IntArray axis={}, bool keepdim=false)
  • op : sum
    args : (Tensor x, IntArray axis={}, DataType dtype=DataType::UNDEFINED, bool keepdim=false)

Contributor Author

Modified to "axis" now; "dim" is the attribute name in static mode.

std::vector<TensorDistAttr> new_input_dist_attrs;
std::vector<TensorDistAttr> output_dist_attrs;
output_dist_attrs.emplace_back(output_dist_attr);
// step2.3: update the input dist_attr if reshard is needed. When
Contributor

The replicate logic for the reduce axis is wrong.

If the op is linear (e.g. sum, all, mean) and the reduce dim is sharded, there is no need to reshard the reduce axis as replicated; we just need to mark this axis of the output tensor as "Partial" on the sharding mesh dim.

If the op is non-linear (e.g. variance), the replicate logic is needed.

Contributor Author

Done
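
The decision described in the review could be sketched like this (illustrative types, not Paddle's actual TensorDistAttr API): for each sharded reduce axis, a linear op keeps the input sharding and marks the output Partial on that mesh dim, while a non-linear op reshards the input axis to replicated:

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Per-axis dims mapping: entry i is the mesh dim axis i is sharded on,
// or -1 for replicated (the common dims_mapping convention).
struct Decision {
  std::vector<int64_t> new_input_dims_mapping;  // possibly resharded input
  std::set<int64_t> output_partial_on;          // mesh dims output is Partial on
};

Decision InferReducePartial(const std::vector<int64_t>& input_dims_mapping,
                            const std::vector<int64_t>& reduce_axes,
                            bool linearity) {
  Decision d{input_dims_mapping, {}};
  for (int64_t axis : reduce_axes) {
    int64_t mesh_dim = input_dims_mapping[axis];
    if (mesh_dim < 0) continue;  // reduce axis not sharded: nothing to do
    if (linearity) {
      // Linear op (sum, all, mean): no reshard; output is Partial here.
      d.output_partial_on.insert(mesh_dim);
    } else {
      // Non-linear op (e.g. variance): reshard the axis to replicated.
      d.new_input_dims_mapping[axis] = -1;
    }
  }
  return d;
}
```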

bool keep_dim = ExtractAttr<bool>("keep_dim", attrs);
// bool keep_dim = false;
std::vector<int64_t> reduce_dims =
ExtractAttr<std::vector<int64_t>>("dim", attrs);
Contributor

There should be other attributes for reduce:

"reduce_type": sum, max, min, mean
"linearity": true/false

Contributor Author

Added a "linearity" attribute.


// step2.4: handle partial
// Step2.4.1 Output Partial
std::vector<int64_t> partial_on_dims =
Contributor

We should use this logic to infer the partial dim:

If an axis is missing in the output tensor, and that axis is sharded in the input tensor, the output tensor is Partial on the corresponding mesh dim.

Contributor Author

Done
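
In einsum-notation terms, the rule above might look like this sketch (the helper name and signature are assumptions for illustration, not the PR's actual code): any input axis that is absent from the output notation and sharded in the input makes the output Partial on that mesh dim:

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Given the einsum letters of input and output and the input's per-axis
// dims mapping (-1 = replicated), collect the mesh dims the output is
// Partial on: axes reduced away (missing from the output) that were sharded.
std::set<int64_t> GetPartialDims(const std::string& input_notation,
                                 const std::string& output_notation,
                                 const std::vector<int64_t>& input_dims_mapping) {
  std::set<int64_t> partial;
  for (size_t i = 0; i < input_notation.size(); ++i) {
    bool in_output =
        output_notation.find(input_notation[i]) != std::string::npos;
    if (!in_output && input_dims_mapping[i] >= 0) {
      partial.insert(input_dims_mapping[i]);
    }
  }
  return partial;
}
```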

@pkuzyc pkuzyc force-pushed the new_reduction branch 2 times, most recently from 4bd765c to b04a262 Compare July 6, 2023 06:56

// step2.4: handle partial
// Step2.4.1 Output Partial
// If the op is a linear op, i.e. `linearity` is true, the output's
Contributor

A non-linear op requires its input to be non-partial, but it can still generate partial output.

Contributor Author

Done

@pkuzyc pkuzyc closed this Jul 7, 2023
@pkuzyc pkuzyc reopened this Jul 7, 2023
Contributor

@JZ-LIANG JZ-LIANG left a comment


LGTM

@JZ-LIANG JZ-LIANG merged commit 35b72e8 into PaddlePaddle:develop Jul 7, 2023
@pkuzyc pkuzyc deleted the new_reduction branch July 12, 2023 03:56
cqulilujia pushed a commit to cqulilujia/Paddle that referenced this pull request Jul 24, 2023
* add reduction spmd rule for auto parallel

* fix the logic of handling partial

* fix code style

* fix the partial handling