We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
作者你好,我在训练你的网络的时候,在迭代过程中遇到了loss为nan的问题,无法正常训练,请问这要怎么解决呢 2023-06-16 21:58:43,765 INFO: [debug..][epoch: 0, iter: 151, lr:(2.000e-04,2.000e-05,)] [eta: 395 days, 5:42:49, time (data): 0.489 (0.001)] l_pix: -3.0743e+01 2023-06-16 21:58:44,519 INFO: [debug..][epoch: 0, iter: 152, lr:(2.000e-04,2.000e-05,)] [eta: 392 days, 15:56:30, time (data): 0.482 (0.001)] l_pix: -2.8084e+01 2023-06-16 21:58:44,519 INFO: Saving models and training states. Test 00001088: 100%|██████████| 1089/1089 [23:56<00:00, 1.29s/image] 2023-06-16 22:22:48,051 INFO: Validation debug, # psnr: 28.4838 # ssim: 0.9036 2023-06-16 22:22:48,546 INFO: [debug..][epoch: 0, iter: 153, lr:(2.000e-04,2.000e-05,)] [eta: 411 days, 19:13:58, time (data): 0.490 (0.001)] l_pix: -3.0197e+01 2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01 2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf 2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan 2023-06-16 22:22:50,474 INFO: [debug..][epoch: 0, iter: 157, lr:(2.000e-04,2.000e-05,)] [eta: 401 days, 9:30:33, time (data): 0.487 (0.002)] l_pix: nan 2023-06-16 22:22:50,938 INFO: [debug..][epoch: 0, iter: 158, lr:(2.000e-04,2.000e-05,)] [eta: 398 days, 21:02:06, time (data): 0.464 (0.001)] l_pix: nan 2023-06-16 22:22:51,750 INFO: [debug..][epoch: 0, iter: 159, lr:(2.000e-04,2.000e-05,)] [eta: 396 days, 9:26:15, time (data): 0.487 (0.001)] l_pix: nan 2023-06-16 22:22:52,240 INFO: [debug..][epoch: 0, iter: 160, lr:(2.000e-04,2.000e-05,)] [eta: 393 days, 22:28:10, time (data): 0.490 (0.002)] l_pix: nan 2023-06-16 22:22:52,240 INFO: Saving models and training states. Test 00001088: 100%|██████████| 1089/1089 [21:33<00:00, 1.19s/image] 2023-06-16 22:44:32,789 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002 2023-06-16 22:44:33,276 INFO: [debug..][epoch: 0, iter: 161, lr:(2.000e-04,2.000e-05,)] [eta: 410 days, 1:52:18, time (data): 0.481 (0.001)] l_pix: nan 2023-06-16 22:44:33,763 INFO: [debug..][epoch: 0, iter: 162, lr:(2.000e-04,2.000e-05,)] [eta: 407 days, 13:36:31, time (data): 0.485 (0.001)] l_pix: nan
The text was updated successfully, but these errors were encountered:
No branches or pull requests
作者你好,我在训练你的网络的时候,在迭代过程中遇到了loss为nan的问题,无法正常训练,请问这要怎么解决呢
2023-06-16 21:58:43,765 INFO: [debug..][epoch: 0, iter: 151, lr:(2.000e-04,2.000e-05,)] [eta: 395 days, 5:42:49, time (data): 0.489 (0.001)] l_pix: -3.0743e+01
2023-06-16 21:58:44,519 INFO: [debug..][epoch: 0, iter: 152, lr:(2.000e-04,2.000e-05,)] [eta: 392 days, 15:56:30, time (data): 0.482 (0.001)] l_pix: -2.8084e+01
2023-06-16 21:58:44,519 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [23:56<00:00, 1.29s/image]
2023-06-16 22:22:48,051 INFO: Validation debug, # psnr: 28.4838 # ssim: 0.9036
2023-06-16 22:22:48,546 INFO: [debug..][epoch: 0, iter: 153, lr:(2.000e-04,2.000e-05,)] [eta: 411 days, 19:13:58, time (data): 0.490 (0.001)] l_pix: -3.0197e+01
2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01
2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf
2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan
2023-06-16 22:22:50,474 INFO: [debug..][epoch: 0, iter: 157, lr:(2.000e-04,2.000e-05,)] [eta: 401 days, 9:30:33, time (data): 0.487 (0.002)] l_pix: nan
2023-06-16 22:22:50,938 INFO: [debug..][epoch: 0, iter: 158, lr:(2.000e-04,2.000e-05,)] [eta: 398 days, 21:02:06, time (data): 0.464 (0.001)] l_pix: nan
2023-06-16 22:22:51,750 INFO: [debug..][epoch: 0, iter: 159, lr:(2.000e-04,2.000e-05,)] [eta: 396 days, 9:26:15, time (data): 0.487 (0.001)] l_pix: nan
2023-06-16 22:22:52,240 INFO: [debug..][epoch: 0, iter: 160, lr:(2.000e-04,2.000e-05,)] [eta: 393 days, 22:28:10, time (data): 0.490 (0.002)] l_pix: nan
2023-06-16 22:22:52,240 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [21:33<00:00, 1.19s/image]
2023-06-16 22:44:32,789 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002
2023-06-16 22:44:33,276 INFO: [debug..][epoch: 0, iter: 161, lr:(2.000e-04,2.000e-05,)] [eta: 410 days, 1:52:18, time (data): 0.481 (0.001)] l_pix: nan
2023-06-16 22:44:33,763 INFO: [debug..][epoch: 0, iter: 162, lr:(2.000e-04,2.000e-05,)] [eta: 407 days, 13:36:31, time (data): 0.485 (0.001)] l_pix: nan
The text was updated successfully, but these errors were encountered: