[Typing][A-16,A-19] Add type annotations for base Layer and containers #65190

Merged
SigureMo merged 7 commits into PaddlePaddle:develop from typing/layer-and-container
Jun 17, 2024

Conversation

SigureMo
Member

PR Category

User Experience

PR Types

Improvements

Description

Add type hints for the Layer base class and related container classes

Related links

PCard-66972


paddle-bot bot commented Jun 14, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@SigureMo
Member Author

https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/detail/10889916/job/26539504

@megemini This PR hits the same segfault problem. Could it be caused by the most recent update to multi-process execution?

@SigureMo
Member Author

@megemini Also, I can't make sense of the error in the latest static-check pipeline 😂 Where is the error? https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/detail/10890087/job/26539763

@megemini
Contributor

megemini commented Jun 15, 2024

@SigureMo

It's not a type problem; the doctest example code is wrong ~

2024-06-15 12:17:48 DOCTEST RESULT
2024-06-15 12:17:48 * FAILURE: <modpath?>::paddle.DataParallel.to:1:0
2024-06-15 12:17:48 * REASON: GotWantException
2024-06-15 12:17:48 DOCTEST DEBUG INFO
2024-06-15 12:17:48   XDoc "<modpath?>::paddle.DataParallel.to:1:0", line 32 <- wrt doctest
2024-06-15 12:17:48   File "<fpath?>", line 32, <- wrt source file
2024-06-15 12:17:48 DOCTEST PART BREAKDOWN
2024-06-15 12:17:48 Passed Parts:
2024-06-15 12:17:48      1 >>> import paddle
2024-06-15 12:17:48      2 >>> paddle.seed(2023)
2024-06-15 12:17:48      4 >>> linear=paddle.nn.Linear(2, 2)
2024-06-15 12:17:48      5 >>> linear.weight
2024-06-15 12:17:48      6 >>> print(linear.weight)
2024-06-15 12:17:48       Parameter containing:
2024-06-15 12:17:48       Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False,
2024-06-15 12:17:48              [[ 0.89611185,  0.04935038],
2024-06-15 12:17:48               [-0.58883440,  0.99266374]])
2024-06-15 12:17:48     12 >>> linear.to(dtype='float64')
2024-06-15 12:17:48     13 >>> linear.weight
2024-06-15 12:17:48     14 >>> print(linear.weight)
2024-06-15 12:17:48       Parameter containing:
2024-06-15 12:17:48       Tensor(shape=[2, 2], dtype=float64, place=Place(cpu), stop_gradient=False,
2024-06-15 12:17:48              [[ 0.89611185,  0.04935038],
2024-06-15 12:17:48               [-0.58883440,  0.99266374]])
2024-06-15 12:17:48     20 >>> linear.to(device='cpu')
2024-06-15 12:17:48     21 >>> linear.weight
2024-06-15 12:17:48     22 >>> print(linear.weight)
2024-06-15 12:17:48       Parameter containing:
2024-06-15 12:17:48       Tensor(shape=[2, 2], dtype=float64, place=Place(cpu), stop_gradient=False,
2024-06-15 12:17:48              [[ 0.89611185,  0.04935038],
2024-06-15 12:17:48               [-0.58883440,  0.99266374]])
2024-06-15 12:17:48     28 >>> # xdoctest: +REQUIRES(env:GPU)
2024-06-15 12:17:48     29 >>> linear.to(device=paddle.CUDAPinnedPlace(), blocking=False)
2024-06-15 12:17:48     30 >>> linear.weight
2024-06-15 12:17:48 Failed Part:
2024-06-15 12:17:48     31 >>> print(linear.weight)
2024-06-15 12:17:48       Parameter containing:
2024-06-15 12:17:48       Tensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False,
2024-06-15 12:17:48              [[ 0.89611185,  0.04935038],
2024-06-15 12:17:48               [-0.58883440,  0.99266374]])
2024-06-15 12:17:48 DOCTEST TRACEBACK
2024-06-15 12:17:48 Expected:
2024-06-15 12:17:48     Tensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False,
2024-06-15 12:17:48     [[ 0.89611185,  0.04935038],
2024-06-15 12:17:48      [-0.58883440,  0.99266374]])
2024-06-15 12:17:48 Got:
2024-06-15 12:17:48     Parameter containing:
2024-06-15 12:17:48     Tensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False,
2024-06-15 12:17:48            [[ 0.89611185,  0.04935038],
2024-06-15 12:17:48             [-0.58883440,  0.99266374]])
2024-06-15 12:17:48     
2024-06-15 12:17:48 Repr Difference:
2024-06-15 12:17:48     got  = 'Parameter containing:\nTensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False,\n       [[ 0.89611185,  0.04935038],\n        [-0.58883440,  0.99266374]])'
2024-06-15 12:17:48     want = 'Tensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False,\n[[ 0.89611185,  0.04935038],\n [-0.58883440,  0.99266374]])'
2024-06-15 12:17:48 DOCTEST REPRODUCTION
2024-06-15 12:17:48 CommandLine:
2024-06-15 12:17:48     python -m xdoctest <modpath?> paddle.DataParallel.to:1:0

The example code needs to be fixed first ~
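The failure in the log above comes down to the expected output missing the "Parameter containing:" line that the printed parameter actually produces. A minimal sketch of that got/want mismatch follows. It uses the standard-library doctest module rather than xdoctest (their matching rules are not identical), and FakeParam, WRONG, RIGHT and run_example are hypothetical names introduced only for illustration; nothing here imports paddle.

```python
import doctest


class FakeParam:
    """Hypothetical stand-in for a parameter; only its printed form matters."""

    def __str__(self) -> str:
        return "Parameter containing:\nTensor(shape=[1], dtype=float64)"


# Expected output that omits the first printed line (the kind of mistake in the log).
WRONG = """
>>> print(p)
Tensor(shape=[1], dtype=float64)
"""

# Expected output that matches what print() really emits.
RIGHT = """
>>> print(p)
Parameter containing:
Tensor(shape=[1], dtype=float64)
"""


def run_example(source: str) -> int:
    """Run one doctest snippet and return the number of failed examples."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(source, {"p": FakeParam()}, "demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts are of interest here.
    return runner.run(test, out=lambda text: None).failed
```

With this sketch, run_example(WRONG) reports one failed example and run_example(RIGHT) reports none, which mirrors why the docstring's expected output has to be updated rather than the type annotations.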

It's still unclear what causes the segfault here ~ The log order in CI is sometimes scrambled. Locally, I did not hit an error when type checking paddle.clone:

(venv38dev) shun@shun-B660M-Pro-RS ~/Documents/Projects/paddle/megemini/Paddle (hack6_8) $ python tools/type_checking.py paddle.clone
----------------Codeblock Type Checking Start--------------------
>>> Get docstring from api ...
API_PR is diff from API_DEV: dict_keys(['paddle.clone'])
Total api: 1
>>> Running type checker ...
>>> Print summary ...
----------------Check results--------------------
----------------Check results--------------------
>>> Type checking is successful!
>>> Type checking is successful!
----------------End of the Check--------------------
----------------End of the Check--------------------

Currently, the CI logic for doctest and type checking works like this:

  • run doctest
  • run type checking
  • doctest prints its summary and exits if there are errors
  • type checking prints its summary and exits if there are errors

So the structure of this log is:

  • doctest
  • type checking
  • doctest prints its summary, finds errors, and exits

That is why the type checking summary is never printed at the end ~
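The ordering described above can be sketched as a two-phase runner: every check runs first, then the summaries are printed in order, stopping at the first one that has errors. This is only an illustration of the described control flow, not the actual CI script; run_checks and its log parameter are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

# A check is a name plus a callable returning a list of error messages.
Check = Tuple[str, Callable[[], List[str]]]


def run_checks(checks: Sequence[Check], log: Callable[[str], None] = print) -> int:
    """Run all checks, then print summaries; stop at the first failing summary."""
    results = []
    # Phase 1: run every check so its detailed output appears in the log.
    for name, check in checks:
        log(f"running {name}")
        results.append((name, check()))
    # Phase 2: print summaries in order; a failure ends the run early,
    # so later summaries are never printed.
    for name, errors in results:
        log(f"{name} summary: {len(errors)} error(s)")
        if errors:
            return 1
    return 0
```

If the doctest check returns an error and the type checking check returns none, the log contains both "running" lines and the doctest summary, but no type checking summary, which matches the shape of the CI log in question.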

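As for why it is designed this way: both doctest and type checking print logs while they run, so the summaries have to go together at the end; otherwise the type checking output would push the doctest log out of view ~ Also, because type checking is not yet a default action, the two summaries are not handled together ~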

In other words, this log is only missing the type checking summary; the detailed check output is all there ~ type checking should have passed as well: >>> Type checking is successful!

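So my suggestion is to fix the example code first, get doctest passing, and then see whether any problem remains ~ The place that could throw a segfault here is parallel, because it uses from multiprocessing import Manager, Process, which means nested processes ~ For reference, see https://www.paddlepaddle.org.cn/documentation/docs/zh/dev_guides/style_guide_and_references/code_example_writing_specification_cn.html#solo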

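Also, I don't think the segfault is thrown by type checking. mypy does not run the example code, so it is very unlikely to segfault ~ Besides, we never hit this during earlier --full-test runs, and only a few APIs are involved here, so it is unlikely to go wrong ... ...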

@SigureMo
Member Author

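The place that could throw a segfault here is parallel

Where did you get this from? I understand that this PR's API change would be treated as DataParallel, but #65082 has nothing related to it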

@SigureMo
Member Author

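Also, I don't think the segfault is thrown by type checking. mypy does not run the example code, so it is very unlikely to segfault ~

I know, but the fact is that running type_checking directly on my dev machine does produce a segfault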

@megemini
Contributor

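Where did you get this from? I understand that this PR's API change would be treated as DataParallel, but #65082 has nothing related to it
I know, but the fact is that running type_checking directly on my dev machine does produce a segfault

OK, I'll note that down for now ~ If even python tools/type_checking.py paddle.clone throws a segfault, it is unlikely to be running out of memory ... ...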

@SigureMo
Member Author

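Locally, I did not hit an error when type checking paddle.clone:

What if you run it a few more times? It does not fail every time, but it fails with more than 50% probability, and when I switch back to the ProcessPoolExecutor used before the #64991 change, the problem does not occur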
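One way to put a number on an intermittent crash like this is to run the command repeatedly and count how many runs were killed by a signal. The sketch below is a hypothetical helper (count_signal_deaths is not part of the Paddle tooling) and assumes a POSIX system, where subprocess reports death by signal as a negative return code.

```python
import subprocess
import sys
from typing import Sequence


def count_signal_deaths(cmd: Sequence[str], runs: int = 10) -> int:
    """Run cmd `runs` times and count the runs terminated by a signal."""
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code -N means the child died from signal N
        # (for example, SIGSEGV for a segmentation fault).
        if proc.returncode < 0:
            crashes += 1
    return crashes


# Example call (assumes it is run from the repository root):
# count_signal_deaths([sys.executable, "tools/type_checking.py", "paddle.clone"], runs=20)
```

Comparing the count before and after swapping the process-handling code would show whether the change discussed here actually affects the crash rate.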

Contributor

@Aurelius84 Aurelius84 left a comment

LGTM

Contributor

@wanghuancoder wanghuancoder left a comment

LGTM

@SigureMo SigureMo merged commit 4ca9b7b into PaddlePaddle:develop Jun 17, 2024
32 of 33 checks passed
@SigureMo SigureMo deleted the typing/layer-and-container branch June 17, 2024 02:02
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants