Conversation

@Enigmatisms Enigmatisms commented Nov 2, 2025

PR Category

User Experience

PR Types

New features

Description

Added a PyTorch-aligned paddle.compat.nn.Linear, which calls the compat.nn.functional.linear from #76144. Beyond aligning parameter usage and mathematical semantics, the initialization is aligned as well (Kaiming uniform initialization for the weight, uniform initialization for the bias).

All PaConvert checks have passed.
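For context, the forward semantics being aligned can be sketched in pure Python. This is an illustrative sketch only (the helper name is hypothetical, not the PR's code); it assumes the PyTorch convention that paddle.compat.nn.Linear mirrors, where weight has shape [out_features, in_features] and forward computes y = x @ weight.T + bias:

```python
# Hypothetical sketch of the semantics paddle.compat.nn.Linear aligns with:
# y = x @ weight.T + bias, weight.shape == [out_features, in_features].
def linear_forward(x, weight, bias=None):
    """x: [in_features], weight: [out][in], bias: [out] -> [out]."""
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weight]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out

# Example: 2 input features, 3 output features.
y = linear_forward([1.0, 2.0],
                   [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                   [1.0, 1.0, 1.0])  # -> [2.0, 3.0, 4.0]
```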

TODO:

  • Modify the Initializer (place handling)
  • Reuse the _calculate_fan_in_and_fan_out function (the corresponding API has not been submitted in a PR yet)

Pcard-89620

paddle-bot bot commented Nov 2, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@codecov-commenter

Codecov Report

❌ Patch coverage is 97.29730% with 1 line in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@84d14f8). Learn more about missing BASE report.

Files with missing lines              | Patch % | Lines
python/paddle/compat/nn/__init__.py   | 97.22%  | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #76169   +/-   ##
==========================================
  Coverage           ?   97.29%           
==========================================
  Files              ?        2           
  Lines              ?       37           
  Branches           ?        0           
==========================================
  Hits               ?       36           
  Misses             ?        1           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.

@XiaoguangHu01 (Contributor) left a comment

LGTM

# KaimingUniform initializer should be more flexible: user should be able to specify place
expected_place = paddle.base.framework._current_expected_place()
original_place = self.weight.place
nn.init.kaiming_uniform_(self.weight, a=sqrt(5))
Contributor:

Is this because init.kaiming_uniform_ changes the place? That API was also added only recently; if there is a bug, feel free to fix it directly.

Contributor Author:

Yes. init.kaiming_uniform_ moves whatever Tensor it is given onto _current_expected_place(), regardless of the input's original place. If the interface accepted a place argument, this conversion would be unnecessary. That said, computing on the default place (GPU inputs on GPU devices, CPU inputs on CPU devices) is usually fine.

if place_mismatch and in_dynamic_mode():
    self.weight = self.weight.to(original_place)
if self.bias is not None:
    # nn.init._calculate_fan_in_and_fan_out(self.weight) for 2D array
Contributor:

Is the implementation of nn.init._calculate_fan_in_and_fan_out incorrect? That API was also added only recently; if there is a bug, feel free to fix it directly.

Contributor Author:

nn.init._calculate_fan_in_and_fan_out is a PyTorch interface; Paddle does not have it. An equivalent inline computation is used here, and Linear does not need that API (the 2D case is simple).
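The 2D shortcut described above can be sketched as follows (the helper name is illustrative, not Paddle's API): for a weight of shape [out_features, in_features] in the PyTorch convention, fan_in is simply the second dimension and fan_out the first.

```python
def fan_in_and_fan_out_2d(shape):
    """Simplified fan computation for a 2D weight, assuming the
    PyTorch convention weight.shape == [out_features, in_features]."""
    if len(shape) != 2:
        raise ValueError("this simplified helper only handles 2D weights")
    fan_out, fan_in = shape
    return fan_in, fan_out
```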

)
self.in_features = in_features
self.out_features = out_features
self.weight = self.create_parameter(
Contributor:

Another way to initialize is to set weight_attr/bias_attr in create_parameter so that they align with torch; use whichever is easier to write.

Contributor Author:

Considering initialization alone, passing an Initializer via weight_attr/bias_attr is simpler and avoids the manual conversion in dynamic graph mode. However, reset_parameters is a member function of torch.nn.Linear, and many models call it to reset parameters in place, so I reused reset_parameters rather than passing an Initializer. The kaiming_uniform_ issue inside reset_parameters still remains. I will follow up with a PR to simplify this initialization; this PR first provides the aligned functionality.
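For reference, the bound produced by kaiming_uniform_ with a=sqrt(5) can be derived in a few lines: with the leaky-ReLU gain sqrt(2/(1+a²)), a=sqrt(5) collapses the uniform bound to 1/sqrt(fan_in), which is also the bound PyTorch's Linear uses for the bias. A sketch under those assumptions, not Paddle code:

```python
import math

def kaiming_uniform_bound(fan_in, a=math.sqrt(5)):
    """Bound b of U(-b, b) used by Kaiming-uniform init (leaky-ReLU gain)."""
    gain = math.sqrt(2.0 / (1.0 + a * a))   # leaky_relu gain sqrt(2/(1+a^2))
    std = gain / math.sqrt(fan_in)          # target standard deviation
    return math.sqrt(3.0) * std             # U(-b, b) with b = sqrt(3)*std

# With a = sqrt(5), the bound simplifies to 1 / sqrt(fan_in).
```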

@zhwesky2010 (Contributor) commented Nov 3, 2025

> Considering initialization alone, passing an Initializer via weight_attr/bias_attr is simpler and avoids the manual conversion in dynamic graph mode. However, reset_parameters is a member function of torch.nn.Linear, and many models call it to reset parameters in place, so I reused reset_parameters rather than passing an Initializer. The kaiming_uniform_ issue inside reset_parameters still remains. I will follow up with a PR to simplify this initialization; this PR first provides the aligned functionality.

Using reset_parameters also works; let's look at these two points later:

  1. torch.nn.init does not change the weight's place; this implementation appears to contain redundant operations, and it would be better to remove those extra copies.
  2. _calculate_fan_in_and_fan_out is also planned for this release, as an alias of _compute_fans; it is recommended to reuse it here.

(A comment from zhwesky2010 was marked as duplicate.)

@zhwesky2010 (Contributor) left a comment

Let's merge this one first; we can revisit in the next PR.

@Enigmatisms Enigmatisms merged commit b9e19ab into PaddlePaddle:develop Nov 3, 2025
111 of 117 checks passed
@Enigmatisms Enigmatisms deleted the nn_linear branch November 3, 2025 08:45

5 participants