Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【PIR OpTest Fix No.28】 fix test_fused_adam_op #62770

Merged
merged 4 commits into from
Mar 20, 2024

Conversation

Eddie-Wang1120
Copy link
Contributor

PR types

Others

PR changes

Others

Description

PIR Op单测修复
修复单测 test_fused_adam_op
修复后打开FLAGS_enable_pir_in_executor单测是否通过:是

保留inplace报编译时会报错,去掉后编译通过且单侧通过

Copy link

paddle-bot bot commented Mar 16, 2024

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Mar 16, 2024
@luotao1 luotao1 added the HappyOpenSource 快乐开源活动issue与PR label Mar 18, 2024
Copy link
Contributor

@kangguangli kangguangli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以贴下修改前的报错吗?在paddle/fluid/pir/dialect/operator/ir/ops.yaml下的定义是复制自 paddle/phi/api/yaml/legacy_ops.yaml,前者是PIR算子的定义,后者是动态图算子的定义。我们的期望是前者要和后者保持一致,所以这里不能这样修改,需要先明确下是什么样的问题导致跑不通,这个问题可能并不是算子定义的不对。

@Eddie-Wang1120
Copy link
Contributor Author

可以贴下修改前的报错吗?在paddle/fluid/pir/dialect/operator/ir/ops.yaml下的定义是复制自 paddle/phi/api/yaml/legacy_ops.yaml,前者是PIR算子的定义,后者是动态图算子的定义。我们的期望是前者要和后者保持一致,所以这里不能这样修改,需要先明确下是什么样的问题导致跑不通,这个问题可能并不是算子定义的不对。

如最新的更改我在ops.yaml和legacy_ops.yaml都新增新的fused_adam定义,但编译时报错:
1710764629447
这是一个redefination错误,我之前的解决方法是只保留新的,想请教一下同时保留新旧定义的方法是什么 @kangguangli

Copy link
Contributor

@kangguangli kangguangli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这样改应该就行了。关于之前为什么会重定义,原因可能与legacy.yaml里同时定义了fused_adam和fused_adam_有关,但是这两个的名字也不冲突。一个猜测是fused_adam定义了inplace,于是生成的API名是fused_adam_导致冲突,具体这里麻烦 @zyfncg 帮忙解释下。

@@ -1254,6 +1254,15 @@
data_type : float
support_tensor : true

- op : fused_adam
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- op : fused_adam
- op : fused_adam_(fused_adam)

这样就会把旧IR的fused_adam算子映射到PIR的fused_adam_

Comment on lines 563 to 565
- op : fused_adam
args : (Tensor[] params, Tensor[] grads, Tensor learning_rate, Tensor[] moments1, Tensor[] moments2, Tensor[] beta1_pows, Tensor[] beta2_pows, Tensor[] master_params, Tensor skip_update, Scalar beta1, Scalar beta2, Scalar epsilon, int chunk_size, float weight_decay, bool use_adamw, bool multi_precision, bool use_global_beta_pow)
output : Tensor[](params_out){params.size()}, Tensor[](moments1_out){params.size()}, Tensor[](moments2_out){params.size()}, Tensor[](beta1_pows_out){params.size()}, Tensor[](beta2_pows_out){params.size()}, Tensor[](master_params_out){params.size()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

在legacy_ops.yaml里不要新增定义,这里面的定义是为动态图算子服务的

@@ -741,6 +741,17 @@
data_type : dtype
interfaces : paddle::dialect::InferSymbolicShapeInterface

- op : fused_adam
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- op : fused_adam
- op : fused_adam_

这里我们要和动态图算子保持一致,另外这个算子只有inplace版本,对PIR,所有下划线结尾的都是inplace算子

@@ -143,6 +143,7 @@
'dpsgd',
'embedding_grad_sparse',
'ftrl',
'fused_adam',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'fused_adam',
'fused_adam_',

@Eddie-Wang1120
Copy link
Contributor Author

这样改应该就行了。关于之前为什么会重定义,原因可能与legacy.yaml里同时定义了fused_adam和fused_adam_有关,但是这两个的名字也不冲突。一个猜测是fused_adam定义了inplace,于是生成的API名是fused_adam_导致冲突,具体这里麻烦 @zyfncg 帮忙解释下。

感谢感谢!这里按您说的改完后又补充了optional的master_params_out就可以通过单侧了,学到很多,非常感谢您!

@Eddie-Wang1120
Copy link
Contributor Author

请问这里CI中PR-CI-Static-Check无法通过是因为我在ops.yaml中的optional中增加了master_params_out吗,这里如果不添加的话会报错,报错信息如下:
1710900974669

@kangguangli
Copy link
Contributor

请问这里CI中PR-CI-Static-Check无法通过是因为我在ops.yaml中的optional中增加了master_params_out吗,这里如果不添加的话会报错,报错信息如下: 1710900974669

增加optional是没问题的,static_check_ci没有通过是因为我们现在新增API导致的,这个我找相关同事approve下即可。

Copy link
Contributor

@sunzhongkai588 sunzhongkai588 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@jeff41404 jeff41404 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for API

@kangguangli kangguangli merged commit 66479b9 into PaddlePaddle:develop Mar 20, 2024
30 checks passed
@Eddie-Wang1120 Eddie-Wang1120 deleted the eddie-dev-adam branch March 21, 2024 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contributor External developers HappyOpenSource 快乐开源活动issue与PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants