The meaning and impact of find_unused_parameters #3105

Closed
linyangsdu opened this issue May 21, 2021 · 5 comments
Labels
training Training question

Comments

@linyangsdu

I ran into the following error during training:

Traceback (most recent call last):
  File "tools/train.py", line 140, in <module>
    main()
  File "tools/train.py", line 136, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 111, in run
    trainer.train(FLAGS.eval)
  File "/root/paddlejob/workspace/code/PaddleDetection/ppdet/engine/trainer.py", line 307, in train
    outputs = model(data)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 898, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/parallel.py", line 581, in forward
    list(self._find_varbase(outputs)))
RuntimeError: (PreconditionNotMet) A serious error has occurred here. Please set find_unused_parameters=True to traverse backward graph in each step to prepare reduce in advance. If you have set, There may be several reasons for this error: 1) Please note that all forward outputs derived from the module parameters must participate in the calculation of losses and subsequent gradient calculations. If not, the wrapper will hang, waiting for autograd to generate gradients for these parameters. you can use detach or stop_gradient to make the unused parameters detached from the autograd graph. 2) Used multiple forwards and one backward. You may be able to wrap multiple forwards in a model.

What does the find_unused_parameters parameter mean, what is the effect of setting it to True, and how should this problem be resolved?

@jerrywgz
Collaborator

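Which model ran into this problem? Currently, find_unused_parameters can be set in the config file; for an example, see https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/mot/jde/_base_/jde_darknet53.yml#L3
For the meaning and impact of this parameter, see the explanation in this PR: PaddlePaddle/Paddle#32826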
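
As a rough illustration of what the error message describes (traversing the backward graph to find parameters that will never receive a gradient), here is a toy sketch in plain Python. It is not Paddle's implementation: `Node` and `find_unused` are hypothetical names invented for this example, and the graph is built by hand rather than recorded by an autograd engine.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """One value in a toy autograd graph (hypothetical, not a Paddle type)."""
    name: str
    inputs: list = field(default_factory=list)  # nodes this value was computed from


def find_unused(loss, params):
    """Walk backward from the loss and return the names of parameters never reached."""
    seen, stack = set(), [loss]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.inputs)
    return [p.name for p in params if id(p) not in seen]


# Two-branch model: both branches run in forward, but only branch_a feeds the loss.
w_used = Node("w_used")
w_unused = Node("w_unused")
x = Node("x")
branch_a = Node("branch_a", inputs=[x, w_used])
branch_b = Node("branch_b", inputs=[x, w_unused])  # output is ignored by the loss
loss = Node("loss", inputs=[branch_a])

unused = find_unused(loss, [w_used, w_unused])
print(unused)  # ['w_unused']
```

In a data-parallel setup, a parameter like `w_unused` produces no gradient, so a gradient-synchronization step that waits for every parameter would stall. Doing a traversal like this on every step lets the wrapper know in advance which parameters to skip, presumably at the cost of some extra work per iteration.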

@jerrywgz jerrywgz added the training Training question label May 21, 2021
@linyangsdu
Author

The problem occurred while training cascadercnn r50 dcn. The training was run in AI Studio; the link is:
https://aistudio.baidu.com/studio/project/partial/verify/1849265/49507f7ab6544a4fad12a10436560701

@jerrywgz
Collaborator

Could you paste the error message you get after adding find_unused_parameters?

@jerrywgz jerrywgz self-assigned this May 24, 2021
@nemonameless
Collaborator

nemonameless commented May 26, 2021

You can use Paddle 2.1. Earlier Paddle versions may not have find_unused_parameters, and passing it will raise an error.
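
One way to guard against the version mismatch mentioned above is to check whether the wrapper accepts the keyword before passing it. The sketch below uses only the standard library; `old_wrapper` and `new_wrapper` are hypothetical stand-ins for an older and a newer data-parallel wrapper, and `supports_kwarg` and `wrap` are illustrative helpers, not part of Paddle or PaddleDetection.

```python
import inspect


def supports_kwarg(func, name):
    """Return True if `func` accepts a keyword argument called `name`."""
    params = inspect.signature(func).parameters
    return name in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


# Hypothetical stand-ins for a wrapper before and after the option was added.
def old_wrapper(layers, strategy=None):
    return ("old", layers, {})


def new_wrapper(layers, strategy=None, find_unused_parameters=False):
    return ("new", layers, {"find_unused_parameters": find_unused_parameters})


def wrap(wrapper, layers, find_unused_parameters):
    """Pass the flag only when the wrapper supports it; fail clearly otherwise."""
    if supports_kwarg(wrapper, "find_unused_parameters"):
        return wrapper(layers, find_unused_parameters=find_unused_parameters)
    if find_unused_parameters:
        raise RuntimeError(
            "this wrapper has no find_unused_parameters option; "
            "a newer framework version is probably required"
        )
    return wrapper(layers)


print(wrap(new_wrapper, "model", True))
print(wrap(old_wrapper, "model", False))
```

In real use one would presumably pass the framework's own wrapper in place of the stand-ins. Upgrading, as suggested above, remains the simpler fix.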

@paddle-bot-old

Since this issue has not been updated for more than three months, it will be closed. If it is not solved or there is a follow-up question, please reopen it at any time and we will continue to follow up.
It is recommended to pull and try the latest code first.
