Can training be resumed after an interruption? #32

Open
ly27253 opened this issue May 30, 2023 · 3 comments

@ly27253
ly27253 commented May 30, 2023

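First of all, thank you very much to the authors for their hard work and for sharing it so generously. This is an outstanding project.

I would like to ask whether training can be continued after it is interrupted unexpectedly, or in other words, how to resume training.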

@wtyuan96
Collaborator

Yes, it can. Please add the --resume argument when you launch training.
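This thread does not show how the project implements --resume, so the following is only a generic sketch of what such a flag usually does: reload the last saved checkpoint and continue from the next epoch. Every name here (`build_parser`, `save_checkpoint`, `load_checkpoint`, `train`, the JSON checkpoint layout, the `--ckpt` option) is a hypothetical stand-in, and a real training script would also store model weights, optimizer state, and learning-rate scheduler state.

```python
import argparse
import json
import os


def build_parser():
    """Hypothetical CLI: only --resume comes from the thread, the rest is assumed."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--resume", action="store_true",
                        help="continue from the last saved checkpoint")
    parser.add_argument("--ckpt", default="checkpoint.json")
    parser.add_argument("--epochs", type=int, default=16)
    return parser


def save_checkpoint(path, epoch, state):
    """Write to a temp file first so an interruption cannot leave a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"epoch": epoch, "state": state}, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (next_epoch, state); start fresh when no checkpoint exists."""
    if not os.path.exists(path):
        return 0, {"steps": 0}
    with open(path) as f:
        ckpt = json.load(f)
    return ckpt["epoch"] + 1, ckpt["state"]


def train(num_epochs, ckpt_path, resume=False, stop_after=None):
    """Toy training loop; stop_after simulates an unexpected interruption."""
    if resume:
        start_epoch, state = load_checkpoint(ckpt_path)
    else:
        start_epoch, state = 0, {"steps": 0}
    for epoch in range(start_epoch, num_epochs):
        if stop_after is not None and epoch >= stop_after:
            break
        state["steps"] += 1  # placeholder for one real epoch of training
        save_checkpoint(ckpt_path, epoch, state)
    return state
```

With this sketch, an interrupted run followed by `train(..., resume=True)` picks up at the epoch after the last one that was saved, rather than starting again from epoch 0.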

@ly27253
Author

ly27253 commented May 31, 2023

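Thank you very much for taking the time to reply! I have a few more questions. Does this project provide log files from its training process, for example something like the following: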

Epoch 0/16, Iter 0/27097, lr 0.000335, train loss = 19.370, depth loss = 128.569, entropy loss = 19.370, time = 9.277
Epoch 0/16, Iter 10/27097, lr 0.000348, train loss = 17.496, depth loss = 82.729, entropy loss = 17.496, time = 0.652
Epoch 0/16, Iter 20/27097, lr 0.000361, train loss = 16.136, depth loss = 76.066, entropy loss = 16.136, time = 0.654
Epoch 0/16, Iter 30/27097, lr 0.000375, train loss = 14.857, depth loss = 75.840, entropy loss = 14.857, time = 0.666

Epoch 0/1, Iter 6090/6174, test loss = 32.119, depth loss = 324.484, entropy loss = 32.119, time = 0.151517
Epoch 0/1, Iter 6100/6174, test loss = 31.801, depth loss = 367.143, entropy loss = 31.801, time = 0.153087
Epoch 0/1, Iter 6110/6174, test loss = 31.482, depth loss = 359.672, entropy loss = 31.482, time = 0.153128
Epoch 0/1, Iter 6120/6174, test loss = 30.979, depth loss = 350.101, entropy loss = 30.979, time = 0.165338
Epoch 0/1, Iter 6130/6174, test loss = 30.580, depth loss = 349.963, entropy loss = 30.580, time = 0.260605
Epoch 0/1, Iter 6140/6174, test loss = 30.033, depth loss = 353.722, entropy loss = 30.033, time = 0.271916
Epoch 0/1, Iter 6150/6174, test loss = 29.811, depth loss = 351.325, entropy loss = 29.811, time = 0.275013
Epoch 0/1, Iter 6160/6174, test loss = 29.617, depth loss = 354.346, entropy loss = 29.617, time = 0.267842
Epoch 0/1, Iter 6170/6174, test loss = 32.595, depth loss = 375.382, entropy loss = 32.595, time = 0.267094

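I would like to compare my own training run against the training output of the original project to check my approach. If such logs exist, would it be possible to share them? One more question, if I may: during training, for example around Epoch 0/1, Iter 6170/6174, the loss is fairly large. Is this normal?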
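For a comparison like the one described above, log lines in this format can be parsed into records and matched by phase, epoch, and iteration. This is a minimal sketch that assumes every line follows the layout quoted in this comment; `parse_log` and `compare_runs` are hypothetical helper names, not part of the project.

```python
import re

# Matches both the train lines (with "lr ...") and the test lines (without it).
LOG_PATTERN = re.compile(
    r"Epoch (?P<epoch>\d+)/(?P<epochs>\d+), "
    r"Iter (?P<iter>\d+)/(?P<iters>\d+), "
    r"(?:lr (?P<lr>[\d.]+), )?"
    r"(?P<phase>train|test) loss = (?P<loss>[\d.]+), "
    r"depth loss = (?P<depth_loss>[\d.]+), "
    r"entropy loss = (?P<entropy_loss>[\d.]+), "
    r"time = (?P<time>[\d.]+)"
)


def parse_log(text):
    """Turn raw log text into a list of dicts, one per matched log entry."""
    records = []
    for match in LOG_PATTERN.finditer(text):
        fields = match.groupdict()
        records.append({
            "phase": fields["phase"],
            "epoch": int(fields["epoch"]),
            "iter": int(fields["iter"]),
            "lr": float(fields["lr"]) if fields["lr"] is not None else None,
            "loss": float(fields["loss"]),
            "depth_loss": float(fields["depth_loss"]),
            "entropy_loss": float(fields["entropy_loss"]),
            "time": float(fields["time"]),
        })
    return records


def compare_runs(mine, reference, key="loss"):
    """Map (phase, epoch, iter) -> (my value, reference value, difference)
    for every entry that appears in both runs."""
    ref_by_step = {(r["phase"], r["epoch"], r["iter"]): r[key] for r in reference}
    diffs = {}
    for r in mine:
        step = (r["phase"], r["epoch"], r["iter"])
        if step in ref_by_step:
            diffs[step] = (r[key], ref_by_step[step], r[key] - ref_by_step[step])
    return diffs


sample = (
    "Epoch 0/16, Iter 0/27097, lr 0.000335, train loss = 19.370, "
    "depth loss = 128.569, entropy loss = 19.370, time = 9.277\n"
    "Epoch 0/1, Iter 6170/6174, test loss = 32.595, "
    "depth loss = 375.382, entropy loss = 32.595, time = 0.267094"
)
records = parse_log(sample)
```

Passing `key="depth_loss"` to `compare_runs` compares the depth loss instead of the total loss.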

@EllenYiGe

Hi, I am training this model too. Your test loss is too large. In my case, though, the test loss oscillates periodically.
