
[AutoParallel] Add paddle.distributed.dtensor_from_fn api #56565

Merged
merged 22 commits into from
Sep 12, 2023

Conversation

yangxiaoyu14
Contributor

PR types

New features

PR changes

APIs

Description

Pcard-73145

[AutoParallel] Add paddle.distributed.dtensor_from_fn api

@CLAassistant

CLAassistant commented Aug 23, 2023

CLA assistant check
All committers have signed the CLA.

… add_dtensor_from_fn_api

delete /Paddle/build/test/auto_parallel/test_dist_tensor.py
@paddle-bot paddle-bot bot added the contributor External developers label Aug 23, 2023
@@ -24,7 +24,7 @@
class DistAttr(core.TensorDistAttr):
"""
DistAttr specifies how tensors are distributed or sliced on ProcessMesh.

Contributor

The extra whitespace indentation can be removed.

Contributor Author

done, thx

fn, dist_attr, *args, **kwargs
):
"""
Construct a Distributed Tensor from a function of arguments.
Contributor

Should we emphasize here that this has to be a paddle API function, not an arbitrary function?

Contributor Author

done, thx

Construct a Distributed Tensor from a function of arguments.

Args:
fn (callable): A callable function that takes arguments of Distributed Tensor and returns tensor.
Contributor

This description is not quite accurate. As written it says "a callable function that takes Distributed Tensors as arguments and returns a tensor" — it shouldn't take Distributed Tensors as arguments, right?

Contributor Author

Changed to: fn (callable): A paddle api function that takes arguments of *args, **kwargs and returns tensor.

Contributor

Has this been changed?

import paddle
import paddle.distribute as dist

def generate_tensor():
Contributor

What is this function for? Should it be removed?

Contributor Author

done, thx

class TestDistributedTensor(unittest.TestCase):
def test_dtensor_from_fn(self):
# Define a function for generating a tensor
def generate_tensor_ones():
Contributor

These functions don't seem to be used anywhere; they can be removed.

Contributor Author

done, thx

result_random = dist.dtensor_from_fn(paddle.rand, dist_attr=dist_attr, shape=[2, 3])
self.assertIsInstance(result_random, paddle.Tensor)
self.assertEqual(result_random.shape, [2, 3])
self.assertEqual(result_random.dist_attr, dist_attr)
Contributor

Should we add a few exception cases to test the error paths?
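Error-path checks like the ones requested are usually written with assertRaises. A minimal self-contained sketch is below; `dtensor_from_fn` here is a plain-Python stand-in mimicking only hypothetical argument validation, not Paddle's actual implementation:

```python
import unittest

def dtensor_from_fn(fn, dist_attr, *args, **kwargs):
    # Hypothetical stand-in: the real API would construct a distributed
    # tensor here; we only mimic plausible argument checks.
    if not callable(fn):
        raise TypeError("fn must be a callable paddle API function")
    if dist_attr is None:
        raise ValueError("dist_attr must not be None")
    return fn(*args, **kwargs)

class TestDtensorFromFnErrors(unittest.TestCase):
    def test_non_callable_fn(self):
        with self.assertRaises(TypeError):
            dtensor_from_fn("not_callable", dist_attr=object())

    def test_none_dist_attr(self):
        with self.assertRaises(ValueError):
            dtensor_from_fn(lambda: [1.0], dist_attr=None)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDtensorFromFnErrors)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```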

Contributor

Should static graph mode also be tested?

Contributor

Yes.

Contributor Author

done, thx

@luotao1 luotao1 removed the contributor External developers label Aug 24, 2023
Comment on lines 152 to 162
import paddle
import paddle.distribute as dist

# Create a distributed attribute
mesh = dist.ProcessMesh([[2, 4, 5], [0, 1, 3]], dim_names=["x", "y"])
dist_attr = dist.DistAttr(mesh=mesh, sharding_specs=['x', 'y'])

# Call the function dtensor_from_fn with dist_attr parameter
d_tensor = dist.dtensor_from_fn(paddle.ones, dist_attr=dist_attr, shape=[2, 3])

print(d_tensor)
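The pattern in the example above — call a tensor-creating function, then attach a distributed attribute — can be mimicked without Paddle. A minimal pure-Python sketch with hypothetical stand-ins for DistAttr and the returned tensor (illustrative only, not Paddle's real classes):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DistAttr:
    # Hypothetical stand-in for paddle.distributed.DistAttr
    mesh: list
    sharding_specs: list

@dataclass
class DistTensor:
    data: list
    dist_attr: DistAttr

def dtensor_from_fn(fn: Callable, dist_attr: DistAttr, *args, **kwargs) -> DistTensor:
    # Call the tensor-creating function first, then attach the
    # distributed attribute -- the same call shape as the real API.
    local = fn(*args, **kwargs)
    return DistTensor(data=local, dist_attr=dist_attr)

def ones(shape):
    # Stand-in for paddle.ones: a flat list of 1.0s
    n = 1
    for s in shape:
        n *= s
    return [1.0] * n

attr = DistAttr(mesh=[[2, 4, 5], [0, 1, 3]], sharding_specs=["x", "y"])
d = dtensor_from_fn(ones, attr, shape=[2, 3])
print(len(d.data))  # 6 elements for shape [2, 3]
```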
Contributor
Contributor Author

Resubmitted after studying the suggestion; thanks.


Args:
fn (callable): A paddle api function that takes arguments of *args, **kwargs and returns tensor.
dist_attr(paddle.distributed.DistAttr): Specify how tensors are distributed or sliced on ProcessMesh.
Contributor

There is a space between fn and (callable; it's recommended to also add a space between dist_attr and its parenthesis for consistency.

Contributor Author

done

Args:
fn (callable): A paddle api function that takes arguments of *args, **kwargs and returns tensor.
dist_attr(paddle.distributed.DistAttr): Specify how tensors are distributed or sliced on ProcessMesh.
*args: A list of arguments to be passed to the ``fn`` function.
Contributor

Suggest adding parentheses here as well, for a consistent format:

*args (tuple):
**kwargs (dict):

Contributor

These two parameters are a tuple and a dict respectively, not lists; please distinguish them.

Contributor Author

done


.. code-block:: python

>>> import paddle
Contributor

Does this code need to be indented by 4 more spaces?

Contributor Author

done

mesh = dist.ProcessMesh([[2, 4, 5], [0, 1, 3]], dim_names=["x", "y"])
dist_attr = dist.DistAttr(mesh=mesh, sharding_specs=['x', 'y'])

# Test with generate_tensor_ones()
Contributor

The function this comment refers to no longer exists. Please update the comment — its connection to the code below is unclear; for example, change it to "Test with paddle.ones".

Contributor Author

Decided to delete the comments that no longer matched.

@@ -53,6 +53,56 @@ def test_dist_tensor_creation(self):
self.assertEqual(dist_tensor_with_tensor.dist_attr, dist_attr)


class TestDistributedTensor(unittest.TestCase):
Contributor

The test class name should match what is tested — rename it to TestDistTensorFromFn?

Contributor Author

done

@@ -53,6 +53,56 @@ def test_dist_tensor_creation(self):
self.assertEqual(dist_tensor_with_tensor.dist_attr, dist_attr)


class TestDistributedTensor(unittest.TestCase):
def test_dtensor_from_fn(self):
Contributor

Static graph mode is still not tested; please add it.

Contributor Author

Added a test under static graph (enable_static()) mode.

Comment on lines 159 to 170
Examples:

.. code-block:: python

>>> import paddle
>>> import paddle.distribute as dist
>>> # Create a distributed attribute
>>> mesh = dist.ProcessMesh([0, 1], dim_names=["x"])
>>> dist_attr = dist.DistAttr(mesh=mesh, sharding_specs=[None])
>>> # Call the function dtensor_from_fn with dist_attr parameter
>>> d_tensor = dist.dtensor_from_fn(paddle.ones, dist_attr=dist_attr, shape=[1])
>>> print(d_tensor)
Contributor

  • Indent the code-block section and everything below it 4 more spaces.
  • Also, judging by the CI check, there is no paddle.distribute module — it should be paddle.distributed, right? Make sure the code example actually runs.
Suggested change
Examples:
.. code-block:: python
>>> import paddle
>>> import paddle.distribute as dist
>>> # Create a distributed attribute
>>> mesh = dist.ProcessMesh([0, 1], dim_names=["x"])
>>> dist_attr = dist.DistAttr(mesh=mesh, sharding_specs=[None])
>>> # Call the function dtensor_from_fn with dist_attr parameter
>>> d_tensor = dist.dtensor_from_fn(paddle.ones, dist_attr=dist_attr, shape=[1])
>>> print(d_tensor)
Examples:
.. code-block:: python
>>> import paddle
>>> import paddle.distributed as dist
>>> # Create a distributed attribute
>>> mesh = dist.ProcessMesh([0, 1], dim_names=["x"])
>>> dist_attr = dist.DistAttr(mesh=mesh, sharding_specs=[None])
>>> # Call the function dtensor_from_fn with dist_attr parameter
>>> d_tensor = dist.dtensor_from_fn(paddle.ones, dist_attr=dist_attr, shape=[1])
>>> print(d_tensor)

Contributor Author

done, thx

self.assertIsInstance(result, paddle.fluid.framework.Variable)
self.assertEqual(result.shape, (1,))

# Test with generate_tensor_zeros()
Contributor

This comment still needs to be deleted.

Contributor Author

done

self.assertIsInstance(result, paddle.fluid.framework.Variable)
self.assertEqual(result.shape, (1,))

# Test with generate_tensor_random()
Contributor

Same for this comment.

Contributor Author

done

chenwhql
chenwhql previously approved these changes Sep 6, 2023
Contributor

@chenwhql chenwhql left a comment

LGTM

Construct a Distributed Tensor from a function of arguments.

Args:
fn (callable): A callable function that takes arguments of Distributed Tensor and returns tensor.
Contributor

Has this been changed?

Examples:

.. code-block:: python
>>> import paddle
Contributor

Sample code usually doesn't include >>>, does it?

Contributor Author

I checked with Zhong Kai; the current requirement is to include them.
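For context, the `>>>` style under discussion is Python's doctest format, which tooling can execute and verify against the expected output. A minimal self-contained sketch (`add_one` is an illustrative function, not Paddle code):

```python
import doctest

def add_one(x):
    """Add one to x.

    Examples:
        >>> add_one(41)
        42
    """
    return x + 1

# testmod() runs every >>> example found in this module's docstrings
# and compares the actual output with the expected output.
results = doctest.testmod()
print(results.failed)  # 0 when every example passes
```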

def run_dtensor_from_fn(self):
# Create a distributed attribute
mesh = dist.ProcessMesh([0, 1], dim_names=["x"])
dist_attr = dist.DistAttr(mesh=mesh, sharding_specs=[None])
Contributor

Should we also add a test where sharding_specs is not None?

Contributor Author

There was originally a version with non-None sharding_specs, but Yurui later suggested changing it to None, so I did.
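As background for this exchange: each entry of sharding_specs names the process-mesh dimension along which the corresponding tensor axis is sharded, with None meaning that axis is replicated. A hypothetical validity checker, sketched under the assumption that each mesh dimension may shard at most one tensor axis (illustrative only, not Paddle's actual rules):

```python
def check_sharding_specs(specs, dim_names):
    # specs: one entry per tensor axis, either None (replicated)
    # or the name of a mesh dimension to shard that axis along.
    used = set()
    for s in specs:
        if s is None:
            continue
        if s not in dim_names:
            raise ValueError(f"unknown mesh dim {s!r}")
        if s in used:
            raise ValueError(f"mesh dim {s!r} shards two tensor axes")
        used.add(s)
    return True

# Fully replicated (the variant the test settled on) and a sharded variant:
print(check_sharding_specs([None], ["x"]))
print(check_sharding_specs(["x", "y"], ["x", "y"]))
```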

Comment on lines 161 to 162
.. code-block:: python
>>> import paddle
Contributor
Suggested change
.. code-block:: python
>>> import paddle
.. code-block:: python
>>> import paddle

Add a blank line after code-block, otherwise the code block won't render properly in the preview.

Contributor Author
done, thx

jeff41404
jeff41404 previously approved these changes Sep 7, 2023
Contributor

@jeff41404 jeff41404 left a comment

LGTM for API

sunzhongkai588
sunzhongkai588 previously approved these changes Sep 11, 2023
Contributor

@sunzhongkai588 sunzhongkai588 left a comment

LGTM for docs

…kward.parsed.yaml and paddle/fluid/ir/dialect/paddle_dialect/ir/generated/pd_ops.parsed.yaml
Contributor

@sunzhongkai588 sunzhongkai588 left a comment

LGTM for docs

Contributor

@zhiqiu zhiqiu left a comment

LGTM

@chenwhql chenwhql merged commit 85be34f into PaddlePaddle:develop Sep 12, 2023
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
…le#56565)

* def dtensor_from_fn first edition

* dtensor_from_fn first edition

* Delete file /home/Paddle/build/test/auto_parallel/test_dist_tensor.py

* polish code format

* fix sample code formatting issues

* change sample codes ' >>>' to '>>> '

* Add static image single measurement

* modify the Indent of Sample Code

* complete the sample code modification according to ZhongKai's suggestion

* modify according to the review

* change fluid.Variable to static.Variable

* modify according to zhongkai's review

* According to Yifan's suggestion, pull the latest code to resolve conflicts

* remove paddle/fluid/ir/dialect/paddle_dialect/ir/generated/pd_ops_backward.parsed.yaml and paddle/fluid/ir/dialect/paddle_dialect/ir/generated/pd_ops.parsed.yaml