[Bug Fix] Fix the bug in constructing Tensor with different place #75017
PR Category
User Experience
PR Types
Bug fixes
Description
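Problem:
The paddle.Tensor constructor recently gained four new construction modes, all of which are implemented on top of paddle.empty or paddle.tensor. Both of these APIs detect the current device on their own to decide which place the tensor goes on. The new paddle.Tensor() constructors, however, are all meant to create the tensor on the CPU when no device is specified.
In the implementation, the device was only set when calling original_init(), not when creating the Tensor with paddle.empty or paddle.tensor. As a result, in a GPU environment paddle.Tensor() first creates paddle.empty([0]) on the GPU and then converts it to a CPU Tensor through original_init(), which incurs a copy overhead.
In addition, in python/paddle/incubate/multiprocessing/reductions.py,

this location first constructs a GPU Tensor with empty and then constructs a CPU Tensor with original_init.
If this function is called in a child process created by multiprocess, CUDA environment initialization fails, which in turn causes Tensor creation to fail.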

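Solution:
For the newly added constructors, in addition to passing the place argument to original_init, also pass the device argument to paddle.empty() and paddle.tensor(), so that the place stays consistent throughout the creation of the Tensor.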
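
The difference between the two code paths can be illustrated with a small toy model. This is only a sketch and does not use Paddle: `ToyTensor`, `toy_empty`, `toy_original_init`, `CURRENT_DEVICE`, and `copies` are hypothetical stand-ins invented for illustration, and the real `paddle.empty` and `original_init` signatures may differ.

```python
from dataclasses import dataclass

# Hypothetical: the device the allocation helper picks when none is given.
CURRENT_DEVICE = "gpu:0"

# Records every cross-device copy as a (source place, target place) pair.
copies = []


@dataclass
class ToyTensor:
    shape: tuple
    place: str


def toy_empty(shape, device=None):
    """Stand-in for an allocation helper that falls back to the current device."""
    return ToyTensor(tuple(shape), device if device is not None else CURRENT_DEVICE)


def toy_original_init(src, place):
    """Stand-in for original_init(): a copy is needed only if src lives elsewhere."""
    if src.place != place:
        copies.append((src.place, place))
    return ToyTensor(src.shape, place)


def construct_before_fix(place="cpu"):
    # The device is not forwarded, so the temporary lands on the current device
    # and has to be copied over to the requested place afterwards.
    return toy_original_init(toy_empty([0]), place)


def construct_after_fix(place="cpu"):
    # The same place is forwarded to both steps, so no other device is touched.
    return toy_original_init(toy_empty([0], device=place), place)
```

In this model, `construct_before_fix()` records one `("gpu:0", "cpu")` copy, while `construct_after_fix()` records none, because the temporary never touches the GPU. That is also why the second form would avoid initializing CUDA in a child process.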
pcard-71500