[Typing][debug] Temporary PR for monitoring full type annotation coverage; do not merge #65397

megemini · 2024-06-24T03:48:26Z

PR Category

Others

PR Types

Others

Description

Temporary PR for monitoring full type annotation coverage; do not merge.

Related PR: #65008

@SigureMo

paddle-bot · 2024-06-24T03:48:31Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

megemini · 2024-06-24T09:59:46Z

Question 1: How should `EagerParamBase` be handled?

Consider the following code:

            >>> from paddle import LazyGuard
            >>> from paddle.nn import Linear

            >>> with LazyGuard():
            ...     # w and b are initialized lazily and have no memory.
            ...     net = Linear(10, 10)
            ...
            >>> for param in net.parameters():
            ...     # Initialize param and allocate memory explicitly.
            ...     param.initialize()

Type checking fails here, because:

In [7]: type(param)
Out[7]: paddle.base.framework.EagerParamBase

Now look at the usual signature of `parameters` in Layer:

    def parameters(self, include_sublayers: bool = True) -> list[Tensor]:

In other words, `with LazyGuard` changes (adds to) the attributes of Tensor.

Two possible solutions so far:

1. Add the extra attributes of EagerParamBase to tensor.prototype.pyi
2. Have `parameters` return `list[Tensor] | list[EagerParamBase]`
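A minimal sketch of why the checker complains, using hypothetical stand-in classes rather than Paddle's real ones. `Tensor`, `EagerParamBase` and `Layer` here are simplified placeholders, and the `isinstance` narrowing only illustrates the mismatch between the declared and the runtime type; it is not one of the two options above.

```python
from __future__ import annotations


class Tensor:
    """Stand-in for the statically declared Tensor type (hypothetical)."""


class EagerParamBase(Tensor):
    """Stand-in for the runtime parameter type, which has an extra method."""

    def __init__(self) -> None:
        self.initialized = False

    def initialize(self) -> None:
        self.initialized = True


class Layer:
    def __init__(self) -> None:
        self._params: list[Tensor] = [EagerParamBase(), EagerParamBase()]

    def parameters(self) -> list[Tensor]:
        # Declared as list[Tensor], although the items are EagerParamBase.
        return self._params


net = Layer()
for param in net.parameters():
    # Calling param.initialize() directly would be reported by a type
    # checker, because the stand-in Tensor declares no such method.
    if isinstance(param, EagerParamBase):
        param.initialize()  # narrowed to EagerParamBase, so this checks
```

Both options listed above avoid the need for such narrowing in user code, either by widening what `Tensor` declares or by widening the return type.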

P.S. I'll keep collecting issues in this PR as they come up.

megemini · 2024-06-24T10:23:24Z

Question 2: How should `[abstract]` errors be handled?

The problem is in python/paddle/distributed/fleet/utils/fs.py:

`class LocalFS(FS)` inherits from FS but does not implement all of its abstract methods, e.g. "cat", "download", "upload" and "upload_dir".

mypy reports:

<string>:2:12: error: Cannot instantiate abstract class "LocalFS" with abstract attributes "cat", "download", "upload" and "upload_dir"  [abstract]

Possible solutions:

1. Add disable_error_code = "abstract" to the pyproject.toml configuration file
2. Modify the LocalFS source code to implement the methods above
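A minimal sketch of the situation behind the `[abstract]` error, using stand-in classes. The method signatures are hypothetical, and the stand-in base derives from `abc.ABC`, so Python also rejects the instantiation at runtime; the thread does not say whether Paddle's real `FS` behaves that way.

```python
import abc


class FS(abc.ABC):
    """Stand-in base class; method names are taken from the mypy message."""

    @abc.abstractmethod
    def cat(self, path: str) -> str: ...

    @abc.abstractmethod
    def upload(self, local: str, remote: str) -> None: ...


class LocalFS(FS):
    # Neither abstract method is overridden, so this class stays abstract.
    pass


class CompleteLocalFS(FS):
    # Overriding every abstract method makes the class instantiable.
    def cat(self, path: str) -> str:
        return ""

    def upload(self, local: str, remote: str) -> None:
        raise NotImplementedError("upload is not supported locally")


try:
    LocalFS()  # a type checker flags this line as [abstract]
    instantiated = True
except TypeError:
    instantiated = False

client = CompleteLocalFS()
```

`CompleteLocalFS` corresponds to the second option: every abstract method gets an override, even if some of them only raise.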

megemini · 2024-06-24T10:52:15Z

Question 3: No other method may be placed between `@property` and `@xxx.setter`

The following test code exposes the problem:

                >>> import paddle.distributed.fleet as fleet
                >>> strategy = fleet.DistributedStrategy()
                >>> strategy.dgc = True
                >>> strategy.recompute = True
                >>> strategy.recompute_configs = {"checkpoints": ["x"]}
                >>> strategy.save_to_prototxt("dist_strategy.prototxt")

                >>> strategy.load_from_prototxt("dist_strategy.prototxt")

Error:

<string>:4:1: error: Property "recompute" defined in "DistributedStrategy" is read-only  [misc]

After tracking it down, the cause lies in how mypy checks a property, as this reproduction shows:

from __future__ import annotations

import decorator # type: ignore

from typing import TYPE_CHECKING, Callable, TypeVar
from typing_extensions import ParamSpec

non_auto_func_called = True

_InputT = ParamSpec("_InputT")
_RetT = TypeVar("_RetT")
_RetT1 = TypeVar("_RetT1")
_RetT2 = TypeVar("_RetT2")


def __non_auto_func_called__(
    func: Callable[_InputT, _RetT]
) -> Callable[_InputT, _RetT]:
    def __impl__(*args: _InputT.args, **kwargs: _InputT.kwargs) -> _RetT:
        global non_auto_func_called
        non_auto_func_called = False
        return func(*args, **kwargs)

    return __impl__

def wrap_decorator(
    decorator_func: Callable[[Callable[_InputT, _RetT1]], Callable[_InputT, _RetT2]]
) -> Callable[[Callable[_InputT, _RetT1]], Callable[_InputT, _RetT2]]:
    @decorator.decorator
    def __impl__(
        func: Callable[_InputT, _RetT1], *args: _InputT.args, **kwargs: _InputT.kwargs
    ) -> _RetT2:
        wrapped_func = decorator_func(func)
        return wrapped_func(*args, **kwargs)

    return __impl__


is_strict_auto = wrap_decorator(__non_auto_func_called__)


class A:
    def __init__(self, rec: int) -> None:
        self._rec = rec

    @property
    def recompute(self) -> int:
        return self._rec

    @recompute.setter
    @is_strict_auto
    def recompute(self, rec: int) -> None:
        self._rec = rec

class B:
    def __init__(self, rec: int) -> None:
        self._rec = rec

    @property
    def recompute(self) -> int:
        return self._rec

    def tmp(self):
        """Another method cannot be inserted between the two `recompute` definitions:
        test_func_wrap.py:70:6: error: Name "recompute" already defined on line 59  [no-redef]
        test_func_wrap.py:70:6: error: "Callable[[B], int]" has no attribute "setter"  [attr-defined]
        """
        pass

    @recompute.setter
    def recompute(self, rec: int) -> None:
        self._rec = rec

a = A(1)
a.recompute = 3

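As class B in the code above shows, the `def tmp` method cannot be inserted there.

However, DistributedStrategy in python/paddle/distributed/fleet/base/distributed_strategy.py has many properties whose `@xxx.setter` is defined separately from the matching `@property`, which causes the error. For example: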

    @property
    def recompute(self):
...
    @recompute.setter
    @is_strict_auto
    def recompute(self, flag):

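Testing shows that once the two are placed together, the problem goes away.

Solution:

Reorder the source code so that each `@property` is immediately followed by its `@xxx.setter`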
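A minimal sketch of the reordering, using a hypothetical `Strategy` class rather than the real DistributedStrategy: the setter follows the getter directly, and unrelated methods come after the pair.

```python
class Strategy:
    """Hypothetical stand-in for a class that defines many properties."""

    def __init__(self) -> None:
        self._recompute = False

    @property
    def recompute(self) -> bool:
        return self._recompute

    # The setter follows the getter directly; nothing is defined in between.
    @recompute.setter
    def recompute(self, flag: bool) -> None:
        self._recompute = flag

    # Unrelated methods go after the getter/setter pair.
    def describe(self) -> str:
        return f"recompute={self._recompute}"


strategy = Strategy()
strategy.recompute = True
```

Python itself accepts either ordering at runtime; the adjacency only matters to the type checker.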

SigureMo · 2024-06-24T10:56:33Z

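> Add the extra attributes of EagerParamBase to tensor.prototype.pyi

Could we use composition here? Protocol fits the idea of composition very well, similar to an interface in Java or TypeScript and a trait in Rust.

from typing import Protocol

class Lazyable(Protocol):  # the methods unique to lazy parameters mentioned above
    def initialize(self): ...

class IrValue(Protocol):  # methods related to Value
    def is_dense_tensor_type(self): ...

class TensorBase(Protocol):
    def xxx(self): ...  # all of the existing methods

class Tensor(TensorBase, Lazyable, IrValue): ...  # composition of the three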
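A small runnable sketch of the structural matching that makes Protocol suitable for this kind of composition. `Param` and `materialize` are hypothetical names used only for illustration.

```python
from typing import Protocol


class Lazyable(Protocol):
    def initialize(self) -> None: ...


class Param:
    """Does not inherit from Lazyable; it only has a matching method."""

    def __init__(self) -> None:
        self.ready = False

    def initialize(self) -> None:
        self.ready = True


def materialize(obj: Lazyable) -> None:
    # Accepted structurally: any object with initialize() satisfies Lazyable.
    obj.initialize()


p = Param()
materialize(p)
```

Because matching is by shape rather than by inheritance, each group of methods can be described in its own small protocol and combined where needed.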

SigureMo · 2024-06-24T11:30:30Z

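> However, DistributedStrategy in python/paddle/distributed/fleet/base/distributed_strategy.py has many properties whose `@xxx.setter` is defined separately from the matching `@property`, which causes the error. For example:

That's absurd, truly absurd... Still, we can simply change the code here, and it will actually read better afterwards. I do have to say, though, that mypy is really weak on this...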

megemini · 2024-06-24T12:14:08Z

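> Add the extra attributes of EagerParamBase to tensor.prototype.pyi
>
> Could we use composition here? Protocol fits the idea of composition very well, similar to an interface in Java or TypeScript and a trait in Rust.
>
> from typing import Protocol
>
> class Lazyable(Protocol):  # the methods unique to lazy parameters mentioned above
>     def initialize(self): ...
>
> class IrValue(Protocol):  # methods related to Value
>     def is_dense_tensor_type(self): ...
>
> class TensorBase(Protocol):
>     def xxx(self): ...  # all of the existing methods
>
> class Tensor(TensorBase, Lazyable, IrValue): ...  # composition of the three

Sure, the effect should be the same. Neither approach, though, answers the question of when to expose the methods unique to EagerParamBase, since they are bound dynamically. I'll go ahead and modify tensor.prototype.pyi.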

SigureMo · 2024-06-24T12:17:09Z

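> When to expose the methods unique to EagerParamBase

There is no way around that. Static typing cannot cover every runtime trick; we only need to make sure the methods people need get correct hints.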

SigureMo · 2024-06-24T17:25:17Z

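This one is quite important. Over the weekend, when I was wondering what to work on, I wanted to run the full check locally to make sure the issues are converging. Overall, the other tasks can be advanced steadily by developers, but that work will inevitably affect examples that are not being monitored.

(Although I did run it locally, saw it fail, and never followed up...)

As for future monitoring, I think a complete setup would be:

1. If sample code is modified, run the mypy check on that sample code
2. If an API (including its type hints) is modified, run the mypy check on all sample code, provided the run time stays manageable. I think 10 min is acceptable, because API changes are very rare

For now, though, we can work together on resolving the errors reported here.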

megemini · 2024-06-25T07:01:24Z

Question 4: Should `dtype` support `float`?

Consider this example:

import paddle
tensor = paddle.randn([512, 512, 512], "float")

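Is `float` here resolved to float32, float64 or some other type depending on the runtime environment?

Currently DTypeLike only has floatXX.

Possible solutions:

1. Add `float` to DTypeLike
2. Modify the sample code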
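A sketch of why the string "float" is rejected, assuming the alias is built from string literals. `_DTypeLiteral` is a hypothetical, shortened stand-in; the real definition of DTypeLike is not shown in this thread.

```python
from typing import Literal, get_args

# Hypothetical, shortened stand-in for the string part of DTypeLike.
_DTypeLiteral = Literal["float16", "float32", "float64", "int32", "int64"]


def is_declared_dtype(name: str) -> bool:
    """Return True if `name` is one of the literals in the alias."""
    return name in get_args(_DTypeLiteral)
```

With an alias like this, `is_declared_dtype("float32")` holds while `is_declared_dtype("float")` does not, which matches what the type checker reports for the example above.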

SigureMo · 2024-06-25T09:42:18Z

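> Question 4: Should dtype support float?

Is it used in only one place? And that example is not testing the relevant case, so I don't think we need to add it. Let's change the example instead; I don't think we should recommend usage whose semantics are this ambiguous.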

megemini · 2024-06-26T12:23:44Z

Question 5: `distributed` has many broken examples

For example:

2024-06-26 01:14:21 --------------------
2024-06-26 01:14:21 >>> Type hints with api paddle.distributed.sharding.save_group_sharded_model:1 start ...
2024-06-26 01:14:21 import paddle
2024-06-26 01:14:21 from paddle.nn import Linear
2024-06-26 01:14:21 from paddle.distributed import fleet
2024-06-26 01:14:21 from paddle.distributed.sharding import group_sharded_parallel, save_group_sharded_model
2024-06-26 01:14:21 fleet.init(is_collective=True)
2024-06-26 01:14:21 group = paddle.distributed.new_group([0, 1])
2024-06-26 01:14:21 model = Linear(1000, 1000)
2024-06-26 01:14:21 clip = paddle.nn.ClipGradByGlobalNorm(clip_norm=1.0)
2024-06-26 01:14:21 optimizer = paddle.optimizer.AdamW(learning_rate=0.001, parameters=model.parameters(), weight_decay=0.00001, grad_clip=clip)
2024-06-26 01:14:21 model, optimizer, scaler = group_sharded_parallel(model, optimizer, "p_g", scaler=scaler)
2024-06-26 01:14:21 img, label = data
2024-06-26 01:14:21 label.stop_gradient = True
2024-06-26 01:14:21 img.stop_gradient = True
2024-06-26 01:14:21 out = model(img)
2024-06-26 01:14:21 loss = paddle.nn.functional.cross_entropy(input=out, label=label)
2024-06-26 01:14:21 loss.backward()
2024-06-26 01:14:21 optimizer.step()
2024-06-26 01:14:21 optimizer.clear_grad()
2024-06-26 01:14:21 save_group_sharded_model(model, optimizer, output=output_dir)
2024-06-26 01:14:21 >>> Results ...
2024-06-26 01:14:21 >>> mypy normal_report is ...
2024-06-26 01:14:21 <string>:10:83: error: Cannot determine type of "scaler"  [has-type]
2024-06-26 01:14:21 <string>:11:14: error: Name "data" is not defined  [name-defined]
2024-06-26 01:14:21 <string>:19:1: error: "save_group_sharded_model" gets multiple values for keyword argument "output"  [misc]
2024-06-26 01:14:21 <string>:19:51: error: Name "output_dir" is not defined  [name-defined]
2024-06-26 01:14:21 Found 4 errors in 1 file (checked 1 source file)

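Since there is no environment in which to verify these examples (CI does not seem to test them either), there are many errors here. How should we handle them?

Also, did Question 2 get overlooked? 🫠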

SigureMo · 2024-06-26T12:26:05Z

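> Since there is no environment in which to verify these examples (CI does not seem to test them either), there are many errors here. How should we handle them?

Fix what can be fixed and leave the rest as is, or add a blacklist mechanism and skip some APIs for now.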
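A minimal sketch of what such a blacklist could look like. `SKIPPED_APIS` and `select_apis` are hypothetical names, not part of Paddle's tooling; the single entry is the API from the log earlier in this thread.

```python
from typing import Iterable

# Hypothetical skip list of APIs whose examples are not type-checked yet.
SKIPPED_APIS = frozenset(
    {"paddle.distributed.sharding.save_group_sharded_model"}
)


def select_apis(
    all_apis: Iterable[str], skipped: frozenset = SKIPPED_APIS
) -> list:
    """Return the APIs whose examples should still be checked, in order."""
    return [api for api in all_apis if api not in skipped]


to_check = select_apis(
    [
        "paddle.nn.Linear",
        "paddle.distributed.sharding.save_group_sharded_model",
    ]
)
```

Keeping the list explicit makes it easy to see which examples are still unchecked and to shrink it as they get fixed.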

SigureMo · 2024-06-26T12:27:53Z

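> Question 2: How should [abstract] errors be handled?

Oh, I didn't notice that a decision was needed here.

> Add disable_error_code = "abstract" to the pyproject.toml configuration file

Is this supported at file level? Disabling it only at file level seems more appropriate to me.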

megemini · 2024-06-26T13:09:45Z

> Is this supported at file level? Disabling it only at file level seems more appropriate to me.

#65496 adds an ignore in fs.py. It works in local testing.

megemini · 2024-06-28T06:53:37Z

Question 6: Should `weight_attr` be annotated with multiple types?

Take the following example:

import paddle
import paddle.nn as nn
linear = nn.Linear(2, 4, weight_attr=nn.initializer.KaimingNormal())
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

The `weight_attr` in `nn.Linear(2, 4, weight_attr=nn.initializer.KaimingNormal())` can actually accept an Initializer or several other types, for example:

# Option 1: use an Initializer
import paddle
import paddle.nn as nn
linear = nn.Linear(2, 4, weight_attr=nn.initializer.KaimingNormal())
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

# Option 2: use a ParamAttr
import paddle
import paddle.nn as nn
from paddle import ParamAttr
weight_attr = ParamAttr(initializer=nn.initializer.KaimingNormal())
linear = nn.Linear(2, 4, weight_attr=weight_attr)
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

# Option 3: use a str
import paddle
import paddle.nn as nn
from paddle import ParamAttr
linear = nn.Linear(2, 4, weight_attr='weight')
data = paddle.rand([2, 1, 2], dtype='float32')
res = linear(data)
print(res)

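However, the `weight_attr` annotations on Layer-related APIs only say ParamAttr.

Possible solutions:

1. Re-annotate `weight_attr` on Layer-related APIs as a Union
2. Standardize the examples on Option 2 above, i.e. convert to ParamAttr first and then pass it in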
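A sketch of the first option, using stand-in classes rather than Paddle's real ones. The alias name `ParamAttrLike`, the helper `to_param_attr` and the rule that a string becomes the attribute's `name` are all assumptions of this sketch.

```python
from typing import Optional, Union


class Initializer:
    """Stand-in for an initializer base class (hypothetical)."""


class KaimingNormal(Initializer):
    """Stand-in for a concrete initializer (hypothetical)."""


class ParamAttr:
    """Stand-in for a parameter attribute object (hypothetical)."""

    def __init__(
        self,
        name: Optional[str] = None,
        initializer: Optional[Initializer] = None,
    ) -> None:
        self.name = name
        self.initializer = initializer


# Hypothetical alias covering the three forms shown in the examples above.
ParamAttrLike = Union[ParamAttr, Initializer, str]


def to_param_attr(arg: ParamAttrLike) -> ParamAttr:
    """Normalize any accepted form into a ParamAttr."""
    if isinstance(arg, ParamAttr):
        return arg
    if isinstance(arg, Initializer):
        return ParamAttr(initializer=arg)
    return ParamAttr(name=arg)  # assumption: a string is used as the name


attr = to_param_attr(KaimingNormal())
```

The second option keeps the annotation narrow instead and moves this conversion into the examples, so callers always pass a ParamAttr.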

[tmp] change

478455d

megemini requested review from SigureMo and gouzil as code owners June 24, 2024 03:48

paddle-bot bot added the contributor External developers label Jun 24, 2024

SigureMo mentioned this pull request Jun 24, 2024

[Typing] Fix undefined names in example code #65429

Merged

megemini mentioned this pull request Jun 25, 2024

[Typing] Fix some incorrect type annotations and example code #65452

Merged

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

6b54bf8

… tmp_typing_all

This was referenced Jun 25, 2024

[Typing] Fix type errors in some example code, part 1 #65461

Merged

[Typing] Fix type errors in some example code, part 2 #65463

Merged

megemini mentioned this pull request Jun 26, 2024

[Typing] Fix type errors in core.pyi and some example code #65496

Merged

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

78fb419

… tmp_typing_all

megemini added 3 commits June 27, 2024 17:56

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

2e33654

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

b352af3

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

d421e40

… tmp_typing_all

megemini added 3 commits July 12, 2024 20:43

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

319796b

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

cf8f000

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

9d92a90

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

e3b5849

… tmp_typing_all

This was referenced Jul 29, 2024

[Typing] Fix type annotations missed in examples #66752

Closed

[Typing] Fix annotated type: ReduceOp to _ReduceOp #66884

Merged

megemini added 4 commits July 31, 2024 22:14

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

ac98069

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

d6e0ce8

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

14c67fb

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

30ee36d

… tmp_typing_all

This was referenced Aug 10, 2024

[Typing] Fix type annotation errors in dynamic_decode and examples #67295

Merged

[Typing] Fix type annotation errors in exe.run and examples #67302

Merged

megemini added 4 commits August 12, 2024 17:29

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

1a9227f

… tmp_typing_all

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

30f3a83

… tmp_typing_all

[Update] mypy version

020b764

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

609ccac

… tmp_typing_all

megemini mentioned this pull request Aug 18, 2024

[Typing] Fix type annotation errors in shard_optimizer and examples #67529

Merged

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

c25f0c3

… tmp_typing_all

megemini mentioned this pull request Aug 21, 2024

[Typing] Fix type annotation errors in examples #67618

Merged

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

4cffc27

… tmp_typing_all

megemini mentioned this pull request Aug 22, 2024

Add type hints to Paddle framework APIs: Tracking Issue #63597

Closed

16 tasks

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

0387f36

… tmp_typing_all

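megemini changed the title ~~[Typing all][debug] Temporary PR for monitoring full type annotation coverage; do not merge~~ [Typing][debug] Temporary PR for monitoring full type annotation coverage; do not merge Aug 26, 2024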

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

0c80608

… tmp_typing_all

luotao1 closed this Aug 30, 2024
