Loss becomes NaN during training #12

Open
TanLuuu opened this issue Jun 17, 2023 · 0 comments

TanLuuu commented Jun 17, 2023

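Hello, while training your network I ran into a NaN loss partway through the iterations, and training can no longer proceed normally. How can I fix this? The log is below: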
2023-06-16 21:58:43,765 INFO: [debug..][epoch: 0, iter: 151, lr:(2.000e-04,2.000e-05,)] [eta: 395 days, 5:42:49, time (data): 0.489 (0.001)] l_pix: -3.0743e+01
2023-06-16 21:58:44,519 INFO: [debug..][epoch: 0, iter: 152, lr:(2.000e-04,2.000e-05,)] [eta: 392 days, 15:56:30, time (data): 0.482 (0.001)] l_pix: -2.8084e+01
2023-06-16 21:58:44,519 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [23:56<00:00, 1.29s/image]
2023-06-16 22:22:48,051 INFO: Validation debug, # psnr: 28.4838 # ssim: 0.9036
2023-06-16 22:22:48,546 INFO: [debug..][epoch: 0, iter: 153, lr:(2.000e-04,2.000e-05,)] [eta: 411 days, 19:13:58, time (data): 0.490 (0.001)] l_pix: -3.0197e+01
2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01
2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf
2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan
2023-06-16 22:22:50,474 INFO: [debug..][epoch: 0, iter: 157, lr:(2.000e-04,2.000e-05,)] [eta: 401 days, 9:30:33, time (data): 0.487 (0.002)] l_pix: nan
2023-06-16 22:22:50,938 INFO: [debug..][epoch: 0, iter: 158, lr:(2.000e-04,2.000e-05,)] [eta: 398 days, 21:02:06, time (data): 0.464 (0.001)] l_pix: nan
2023-06-16 22:22:51,750 INFO: [debug..][epoch: 0, iter: 159, lr:(2.000e-04,2.000e-05,)] [eta: 396 days, 9:26:15, time (data): 0.487 (0.001)] l_pix: nan
2023-06-16 22:22:52,240 INFO: [debug..][epoch: 0, iter: 160, lr:(2.000e-04,2.000e-05,)] [eta: 393 days, 22:28:10, time (data): 0.490 (0.002)] l_pix: nan
2023-06-16 22:22:52,240 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [21:33<00:00, 1.19s/image]
2023-06-16 22:44:32,789 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002
2023-06-16 22:44:33,276 INFO: [debug..][epoch: 0, iter: 161, lr:(2.000e-04,2.000e-05,)] [eta: 410 days, 1:52:18, time (data): 0.481 (0.001)] l_pix: nan
2023-06-16 22:44:33,763 INFO: [debug..][epoch: 0, iter: 162, lr:(2.000e-04,2.000e-05,)] [eta: 407 days, 13:36:31, time (data): 0.485 (0.001)] l_pix: nan
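
As a first diagnostic step, it can help to pin down exactly which iteration first reports a non-finite loss. The following is a minimal sketch, assuming log lines in the format shown above; the names `LINE_RE` and `first_non_finite` are hypothetical helpers for illustration and are not part of this repository.

```python
import math
import re

# Assumption: every training line carries "iter: N" followed later by "l_pix: VALUE",
# as in the log above. Validation and checkpoint lines have no l_pix field.
LINE_RE = re.compile(r"iter:\s*(\d+).*?l_pix:\s*(\S+)")


def first_non_finite(log_lines):
    """Return (iteration, value) for the first inf/nan l_pix, or None if all are finite."""
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines without a loss value
        value = float(match.group(2))  # float() parses "inf" and "nan" as well
        if not math.isfinite(value):
            return int(match.group(1)), value
    return None


# Three lines copied from the log above.
sample = [
    "2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01",
    "2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf",
    "2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan",
]
print(first_non_finite(sample))  # (155, inf)
```

On this log the loss first becomes `inf` at iter 155 and is `nan` from iter 156 onward, so the batch and intermediate values at iter 155 are the ones worth inspecting.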
