[Semi Auto] Revise spmd rule static mode API #57269

Merged

Conversation

@JZ-LIANG JZ-LIANG commented Sep 13, 2023

PR types

Function optimization

PR changes

Others

Description

Pcard-70448

Refactor the static-mode Python API to distinguish between a single tensor input (tensor) and a list-of-tensors input ([tensor]).

The Python API signature is now exactly the same as its C++ implementation (which follows the format of the op's Python API).

Before:

# matmul rule
results = rule.infer_forward([x, y], [True, False])  # tensors and attrs must be wrapped into lists
results = rule.infer_backward([x, y], [out], [True, False])

# concat rule
results = rule.infer_forward([2], [v1, v2], [axis])  # ranges, list of tensors, list of attrs
results = rule.infer_backward([2, 1], [v1, v2, out], [axis])

# replicate rule
results = rule.infer_forward([2, 2], [x, y, out1, out2], [])  # placeholder for attrs
results = rule.infer_backward([2, 2], [x, y, out1, out2], [])

After:

# matmul rule
result_dist_attrs = rule.infer_forward(x, y, True, False)  # exactly like the matmul Python API
result_dist_attrs = rule.infer_backward(x, y, out, True, False)

# concat rule
results = rule.infer_forward([v1, v2], axis)  # exactly like the concat Python API
results = rule.infer_backward([v1, v2], out, axis)

# replicate rule
results = rule.infer_forward([x, y], [out1, out2])
results = rule.infer_backward([x, y], [out1, out2])
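
For reference, below is a minimal end-to-end sketch of how the new matmul-rule signature might be exercised, loosely modeled on Paddle's spmd-rule unit tests. The import paths and helper names used here (get_phi_spmd_rule, DistTensorSpec, TensorDistAttr, ProcessMesh) are assumptions and may not match this exact revision.

# Illustrative sketch only; helper names and import paths are assumptions,
# not necessarily the exact API at this commit.
from paddle.distributed.auto_parallel.process_mesh import ProcessMesh
from paddle.distributed.auto_parallel.static.dist_attribute import (
    DistTensorSpec,
    TensorDistAttr,
)
from paddle.framework import core

mesh = ProcessMesh([[0, 1], [2, 3]], dim_names=["x", "y"])

# x: [64, 32], row-sharded on mesh dim "x"; y: [32, 48], replicated.
x_dist_attr = TensorDistAttr()
x_dist_attr.process_mesh = mesh
x_dist_attr.dims_mapping = [0, -1]
x = DistTensorSpec([64, 32], x_dist_attr)

y_dist_attr = TensorDistAttr()
y_dist_attr.process_mesh = mesh
y_dist_attr.dims_mapping = [-1, -1]
y = DistTensorSpec([32, 48], y_dist_attr)

# Tensors first, then attributes (trans_x, trans_y), exactly like paddle.matmul.
rule = core.get_phi_spmd_rule("matmul")  # assumed getter name
input_dist_attrs, output_dist_attrs = rule.infer_forward(x, y, False, False)
print(output_dist_attrs[0].dims_mapping)  # expected: [0, -1] for this sharding

The returned value mirrors the C++ signature: a pair of lists holding the inferred dist attributes for the inputs and for the outputs.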

TODO:

  1. Exactly match the op's Python API (filter out the irrelevant attributes internally).
  2. Support more ops via codegen.
  3. Replace dist_tensor_spec with tensor (dist_tensor).

paddle-bot bot commented Sep 13, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot bot commented Sep 13, 2023

✅ This PR's description meets the template requirements!
Please wait for other CI results.

zhiqiu (Contributor) previously approved these changes Sep 13, 2023 and left a comment:

LGTM overall

}

static std::pair<std::vector<TensorDistAttr>, std::vector<TensorDistAttr>>
infer_backward(const phi::distributed::SpmdRule &self, const py::args &args) {
Contributor:

py::args& can be const?

Contributor Author:

It works.

static void parse_single_pyobject(PyObject *obj,
                                  phi::distributed::InferSpmdContext *ctx,
                                  const size_t arg_pos) {
  if (PyList_Check(obj)) {  // list inputs, spmd not allow tuple inputs
Contributor:

I suggest we support both tuple and list.

Contributor Author:

There is no tuple argument in the Phi API YAML right now; we will support it once Phi defines a tuple input.
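
As a caller-side illustration of the list-only behavior discussed here (again with assumed helper and rule names, mirroring the concat example from the description): variadic tensor inputs must be passed as a Python list, since only a list is recognized by the branch above.

# Illustrative only; getter and rule names are assumptions.
from paddle.distributed.auto_parallel.process_mesh import ProcessMesh
from paddle.distributed.auto_parallel.static.dist_attribute import DistTensorSpec, TensorDistAttr
from paddle.framework import core

mesh = ProcessMesh([0, 1], dim_names=["x"])

def make_spec(shape, dims_mapping):
    # Small local helper (not part of Paddle) to build a sharded DistTensorSpec.
    attr = TensorDistAttr()
    attr.process_mesh = mesh
    attr.dims_mapping = dims_mapping
    return DistTensorSpec(shape, attr)

v1 = make_spec([4, 8], [0, -1])
v2 = make_spec([4, 8], [-1, -1])

rule = core.get_phi_spmd_rule("concat")      # assumed rule name
results = rule.infer_forward([v1, v2], 0)    # OK: a list is parsed as variadic tensor inputs
# rule.infer_forward((v1, v2), 0)            # a tuple would not match the PyList_Check branch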

VLOG(6) << "args indx: [" << arg_pos << "] input one tensor.";
DistTensorSpec in = py::cast<DistTensorSpec>(obj);
VLOG(6) << "DistTensorSpec: " << in.to_string();
ctx->EmplaceBackInput(phi::distributed::DistMetaTensor(
Contributor:

Maybe we can consider unifying DistMetaTensor and DistTensorSpec in the future.

Contributor Author:

Yes, that is on the list.

@JZ-LIANG JZ-LIANG merged commit 6e84042 into PaddlePaddle:develop Sep 19, 2023
Frida-a pushed a commit to Frida-a/Paddle that referenced this pull request Oct 14, 2023
* adapt general spmd rule

* polish details

* add new rules
---------

Co-authored-by: Chen Weihang <chenweihang@baidu.com>
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023