@Enigmatisms Enigmatisms commented Nov 3, 2025

PR Category

Operator Mechanism

PR Types

Improvements

Description

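Optimizes the initialization and reset functions of compat.nn.Linear, which was introduced in an earlier PR. In the reset function, the behavior of both nn.initializer.Uniform and nn.initializer.KaimingUniform is changed:

  • When nn.initializer.XXX is instantiated directly and then applied to a parameter by calling it (forward style), the original behavior is as shown in the following code block: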
>>> a = paddle.nn.Linear(3, 4).to('cpu')
>>> a
Linear(in_features=3, out_features=4, dtype=None)
>>> a.weight
Parameter containing:
Tensor(shape=[3, 4], dtype=float32, place=Place(cpu), stop_gradient=False,
       [[-0.45837218, -0.32479179, -0.47977936,  0.04010978],
        [-0.36938468,  0.12928823,  0.82120913, -0.60326314],
        [ 0.89723587, -0.30610302,  0.82338744, -0.04669143]])
>>> init = paddle.nn.initializer.XavierNormal()
>>> init(a.weight)
>>> a.weight
Parameter containing:
Tensor(shape=[3, 4], dtype=float32, place=Place(gpu:0), stop_gradient=False,
       [[ 0.06874870, -0.84757495,  1.06253850, -0.43288031],
        [ 0.54353887, -0.06623430, -0.98724484, -0.62573409],
        [ 0.52956545,  0.14853251, -0.45487842, -0.41272211]])

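As shown, the place changed from cpu to gpu:0. This happens because, in dynamic graph mode, the initializer passes _current_expected_place() as the place instead of the place of the input tensor. Forcing every initializer to use var.place at that point would cause a different problem: when an initializer instance is passed to ParamAttr and ParamAttr helps create the parameters, the tensor's place is undefined:0, so _current_expected_place() has to be used in that case.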
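The place-selection issue described above can be sketched in plain Python. This is a toy model, not Paddle code: `Place`, `current_expected_place`, `old_init_place`, and `fallback_init_place` are hypothetical names, the expected place is assumed to be gpu:0, and the fallback rule is one plausible reading of the description rather than the actual diff of this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """Toy stand-in for a device place (hypothetical, not Paddle's class)."""
    kind: str          # "cpu", "gpu", or "undefined"
    index: int = 0

    def is_defined(self) -> bool:
        return self.kind != "undefined"


def current_expected_place() -> Place:
    """Toy stand-in for _current_expected_place(); assumed to be gpu:0 here."""
    return Place("gpu", 0)


def old_init_place(var_place: Place) -> Place:
    # Original behavior as described: the tensor's own place is ignored.
    return current_expected_place()


def fallback_init_place(var_place: Place) -> Place:
    # Assumed rule: keep the tensor's place when it is defined, and fall
    # back to the expected place only for the undefined (ParamAttr) case.
    return var_place if var_place.is_defined() else current_expected_place()


cpu = Place("cpu")
undefined = Place("undefined")

print(old_init_place(cpu))             # the cpu tensor ends up on gpu:0
print(fallback_init_place(cpu))        # the cpu tensor stays on cpu
print(fallback_init_place(undefined))  # the ParamAttr case still resolves to gpu:0
```

Under these assumptions, a parameter that was moved to cpu keeps its place after re-initialization, while a parameter still being created through ParamAttr gets the expected place as before.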

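This PR only modifies the initializers related to compat.nn.Linear (kaiming_uniform and uniform); the behavior of the other initializers is unchanged. (Personally, I think the original behavior is unreasonable. If it is indeed unreasonable, all of the corresponding initializers need to be changed, and that would take a new PR to "correct" the behavior of every initializer.) On this basis, the initialization behavior of compat.nn.Linear is fixed: it no longer performs a manual place consistency check followed by a to operation.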

Pcard-89620

@paddle-bot

paddle-bot bot commented Nov 3, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@zhwesky2010 zhwesky2010 left a comment

LGTM

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@94faddc). Learn more about missing BASE report.

Additional details and impacted files
@@             Coverage Diff             @@
##             develop    #76196   +/-   ##
===========================================
  Coverage           ?   100.00%           
===========================================
  Files              ?         1           
  Lines              ?         2           
  Branches           ?         0           
===========================================
  Hits               ?         2           
  Misses             ?         0           
  Partials           ?         0           

@Enigmatisms Enigmatisms merged commit b6521c6 into PaddlePaddle:develop Nov 4, 2025
74 of 76 checks passed
@Enigmatisms Enigmatisms deleted the better_compat_linear branch November 4, 2025 11:16