Loss becomes NaN when running a training test in example.ipynb #32

SilentMoebuta · 2023-04-20T08:23:05Z

With the preceding code unchanged, I changed

model(**batch).loss

to

for i in range(10):
    output = model(**batch)
    loss = output.loss
    loss.backward()
    optimizer.step()
    lr_scheduler.step()
    optimizer.zero_grad()
    print(loss.detach().float())

Output:

tensor(3.2207, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')

Only the first step computes a valid loss; every later step gives NaN. Why does this happen?
The learning rate is set to 1e-8.
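
One way to narrow this down is to check, at each step, whether the gradients or the weights are the first to contain NaN/Inf. The snippet below is only a minimal sketch of such a check, not a fix: `non_finite_report` is a hypothetical helper name, and the small `nn.Linear` model is a stand-in so the example runs on its own without the notebook's model.

```python
import torch
from torch import nn


def non_finite_report(model: nn.Module) -> dict:
    """List the parameters whose values or gradients contain NaN/Inf."""
    bad = {"params": [], "grads": []}
    for name, p in model.named_parameters():
        # Weights that are already NaN/Inf point at an earlier optimizer step.
        if not torch.isfinite(p).all():
            bad["params"].append(name)
        # Gradients that are NaN/Inf point at the forward/backward pass.
        if p.grad is not None and not torch.isfinite(p.grad).all():
            bad["grads"].append(name)
    return bad


# Toy stand-in for the real model (assumption: any nn.Module works the same way).
toy = nn.Linear(4, 2)
loss = toy(torch.randn(3, 4)).sum()
loss.backward()
clean = non_finite_report(toy)

# Deliberately corrupt one gradient entry to show what a hit looks like.
toy.weight.grad[0, 0] = float("nan")
corrupted = non_finite_report(toy)
print(clean, corrupted)
```

In the loop above, the helper could be called once after `loss.backward()` and once after `optimizer.step()`. If the gradients are finite but the weights turn non-finite after the step, the update itself is the place to look; if the gradients are already non-finite, the problem is in the forward or backward pass.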

iridescentee · 2023-04-20T19:53:58Z

I have met the same problem. Did you find the solution?
