
[Semi-Auto] Add reduction spmd rule #54991

Merged
merged 4 commits into PaddlePaddle:develop from new_reduction
Jul 7, 2023

Conversation

pkuzyc
Contributor

@pkuzyc pkuzyc commented Jun 29, 2023

PR types

New features

PR changes

Others

Description

Pcard-70448

  1. Modify the ExtractAttr common function to handle mismatches between Python and C++ attribute types (e.g. a Python bool stored as a C++ int).
  2. Add the reduction op's spmd rule for inferring distributed attributes. Implement the InferForward function for the reduction op, i.e. infer the output tensor's distributed attributes from the input tensors'.
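
As a rough illustration of fix 1 (the names `Attr`, `AttrMap`, and `ExtractBoolAttr` here are hypothetical, not Paddle's actual API): when the Python front end serializes a bool attribute as an int, extraction of a bool must tolerate the int-typed entry:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Toy attribute storage standing in for the framework's attribute map.
using Attr = std::variant<bool, int, std::vector<int64_t>>;
using AttrMap = std::map<std::string, Attr>;

// Extract a bool attribute, falling back to an int-typed entry when the
// Python side stored the bool as an int (the type-mismatch case the PR fixes).
bool ExtractBoolAttr(const std::string& name, const AttrMap& attrs) {
  const Attr& a = attrs.at(name);
  if (std::holds_alternative<bool>(a)) return std::get<bool>(a);
  return std::get<int>(a) != 0;
}
```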

@paddle-bot
Copy link

paddle-bot bot commented Jun 29, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@pkuzyc pkuzyc marked this pull request as draft June 29, 2023 08:28
@pkuzyc pkuzyc marked this pull request as ready for review July 1, 2023 09:37
@@ -24,6 +25,7 @@ namespace auto_parallel {

// matmul rule
REGISTER_SPMD_RULE(matmul, MatmulSPMDRule);
Contributor

Register with the op names specifically:

reduce_sum, sum, reduce_max, reduce_min, mean

Contributor Author

Done
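
A toy sketch of what per-op-name registration amounts to (the registry type here is illustrative; Paddle's real mechanism is the REGISTER_SPMD_RULE macro shown in the diff): a single ReductionSPMDRule instance serves reduce_sum, sum, reduce_max, reduce_min and mean:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Minimal stand-ins for the rule class hierarchy.
struct SPMDRuleBase { virtual ~SPMDRuleBase() = default; };
struct ReductionSPMDRule : SPMDRuleBase {};

// Toy registry mapping concrete op names to rule instances.
std::map<std::string, std::shared_ptr<SPMDRuleBase>>& Registry() {
  static std::map<std::string, std::shared_ptr<SPMDRuleBase>> reg;
  return reg;
}

// Register one shared reduction rule under every reduction op name,
// as the reviewer requests.
void RegisterReductionRules() {
  auto rule = std::make_shared<ReductionSPMDRule>();
  for (const char* op :
       {"reduce_sum", "sum", "reduce_max", "reduce_min", "mean"}) {
    Registry()[op] = rule;
  }
}
```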


// step1: Build Einsum Notation
bool keep_dim = ExtractAttr<bool>("keep_dim", attrs);
// bool keep_dim = false;
Contributor

remove it

Contributor Author

Done

bool keep_dim = ExtractAttr<bool>("keep_dim", attrs);
// bool keep_dim = false;
std::vector<int64_t> reduce_dims =
ExtractAttr<std::vector<int64_t>>("dim", attrs);
Contributor

Where does "dim" come from? Should it be "axis"?

Phi API:

  • op : max
    args : (Tensor x, IntArray axis={}, bool keepdim=false)
  • op : mean
    args : (Tensor x, IntArray axis={}, bool keepdim=false)
  • op : sum
    args : (Tensor x, IntArray axis={}, DataType dtype=DataType::UNDEFINED, bool keepdim=false)

Contributor Author

Modified to "axis" now; "dim" is the attribute name in static mode.

std::vector<TensorDistAttr> new_input_dist_attrs;
std::vector<TensorDistAttr> output_dist_attrs;
output_dist_attrs.emplace_back(output_dist_attr);
// step2.3: update the input dist_attr if reshard is needed. When
Contributor

The replicate logic for the reduce axis is wrong.

If the op is linear (e.g. sum, all, mean) and the reduce dim is sharded, there is no need to reshard the reduce axis as replicated; we just need to mark this axis of the output tensor as "Partial" on the sharding mesh dim.

If the op is non-linear (e.g. variance), the replicate logic is needed.

Contributor Author

Done
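
The decision described in the review could be sketched like this (illustrative types, not Paddle's actual TensorDistAttr API): for each sharded reduce axis, a linear op keeps the input sharding and marks the output Partial on that mesh dim, while a non-linear op reshards the input axis to replicated:

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Per-axis dims mapping: entry i is the mesh dim axis i is sharded on,
// or -1 for replicated (the common dims_mapping convention).
struct Decision {
  std::vector<int64_t> new_input_dims_mapping;  // possibly resharded input
  std::set<int64_t> output_partial_on;          // mesh dims output is Partial on
};

Decision InferReducePartial(const std::vector<int64_t>& input_dims_mapping,
                            const std::vector<int64_t>& reduce_axes,
                            bool linearity) {
  Decision d{input_dims_mapping, {}};
  for (int64_t axis : reduce_axes) {
    int64_t mesh_dim = input_dims_mapping[axis];
    if (mesh_dim < 0) continue;  // reduce axis not sharded: nothing to do
    if (linearity) {
      // Linear op (sum, all, mean): no reshard; output is Partial here.
      d.output_partial_on.insert(mesh_dim);
    } else {
      // Non-linear op (e.g. variance): reshard the axis to replicated.
      d.new_input_dims_mapping[axis] = -1;
    }
  }
  return d;
}
```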

bool keep_dim = ExtractAttr<bool>("keep_dim", attrs);
// bool keep_dim = false;
std::vector<int64_t> reduce_dims =
ExtractAttr<std::vector<int64_t>>("dim", attrs);
Contributor

There should be other attributes for reduce:

"reduce_type": sum, max, min, mean
"linearity": true/false

Contributor Author

Added a "linearity" attribute.


// step2.4: handle partial
// Step2.4.1 Output Partial
std::vector<int64_t> partial_on_dims =
Contributor

We should use this logic to infer the partial dim:

If an axis is missing in the output tensor, and that axis is sharded in the input tensor, the output tensor is Partial on the corresponding mesh dim.

Contributor Author

Done
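
In einsum-notation terms, the rule above might look like this sketch (the helper name and signature are assumptions for illustration, not the PR's actual code): any input axis that is absent from the output notation and sharded in the input makes the output Partial on that mesh dim:

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Given the einsum letters of input and output and the input's per-axis
// dims mapping (-1 = replicated), collect the mesh dims the output is
// Partial on: axes reduced away (missing from the output) that were sharded.
std::set<int64_t> GetPartialDims(const std::string& input_notation,
                                 const std::string& output_notation,
                                 const std::vector<int64_t>& input_dims_mapping) {
  std::set<int64_t> partial;
  for (size_t i = 0; i < input_notation.size(); ++i) {
    bool in_output =
        output_notation.find(input_notation[i]) != std::string::npos;
    if (!in_output && input_dims_mapping[i] >= 0) {
      partial.insert(input_dims_mapping[i]);
    }
  }
  return partial;
}
```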

@pkuzyc pkuzyc force-pushed the new_reduction branch 2 times, most recently from 4bd765c to b04a262 Compare July 6, 2023 06:56

// step2.4: handle partial
// Step2.4.1 Output Partial
// If the op is a linear op, i.e. `linearity` is true, the output's
Contributor

A non-linear op requires its input to be non-partial, but it can still generate partial output.

Contributor Author

Done

@pkuzyc pkuzyc closed this Jul 7, 2023
@pkuzyc pkuzyc reopened this Jul 7, 2023
Contributor

@JZ-LIANG JZ-LIANG left a comment


LGTM

@JZ-LIANG JZ-LIANG merged commit 35b72e8 into PaddlePaddle:develop Jul 7, 2023
@pkuzyc pkuzyc deleted the new_reduction branch July 12, 2023 03:56
cqulilujia pushed a commit to cqulilujia/Paddle that referenced this pull request Jul 24, 2023
* add reduction spmd rule for auto parallel

* fix the logic of handling partial

* fix code style

* fix the partial handling