FloatingPointError: Loss became infinite or NaN at iteration=0! #16

Open
niushou opened this issue Nov 15, 2022 · 3 comments


niushou commented Nov 15, 2022

loss_occ_cls is NaN at the first iteration (iteration 0):

[11/15 02:06:37 adet.trainer]: Starting training from iteration 0
Traceback (most recent call last):
File "train_net.py", line 303, in
args=(args,),
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "train_net.py", line 286, in main
return trainer.train()
File "train_net.py", line 83, in train
self.train_loop(self.start_iter, self.max_iter)
File "train_net.py", line 73, in train_loop
self.run_step()
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 494, in run_step
self._trainer.run_step()
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 287, in run_step
self._write_metrics(loss_dict, data_time)
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 302, in _write_metrics
SimpleTrainer.write_metrics(loss_dict, data_time, prefix)
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 339, in write_metrics
f"Loss became infinite or NaN at iteration={storage.iter}!\n"
FloatingPointError: Loss became infinite or NaN at iteration=0!
loss_dict = {'loss_cls': 157.09115600585938, 'loss_box_reg': 5.162332534790039, 'loss_visible_mask': 3.1945271492004395, 'loss_amodal_mask': 2.944978952407837, 'loss_occ_cls': nan, 'loss_rpn_cls': 9.696294784545898, 'loss_rpn_loc': 12.890896797180176}
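
For context, the exception comes from detectron2's finite-loss guard in SimpleTrainer.write_metrics. Below is a minimal, simplified sketch (not the library's exact code) of that check, using the loss values from the log above; it shows why a single NaN entry such as loss_occ_cls aborts training even though the other losses are finite.

```python
# Simplified sketch of detectron2's finite-loss check (not the library's exact
# implementation), using the values reported in the log above.
import math

loss_dict = {
    "loss_cls": 157.09115600585938,
    "loss_box_reg": 5.162332534790039,
    "loss_visible_mask": 3.1945271492004395,
    "loss_amodal_mask": 2.944978952407837,
    "loss_occ_cls": float("nan"),   # the offending NaN entry
    "loss_rpn_cls": 9.696294784545898,
    "loss_rpn_loc": 12.890896797180176,
}

total_loss = sum(loss_dict.values())      # NaN propagates through the sum
if not math.isfinite(total_loss):
    raise FloatingPointError(
        f"Loss became infinite or NaN at iteration=0!\nloss_dict = {loss_dict}"
    )
```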

@SeungBack (Contributor) commented:

Please make sure that cfg.SOLVER.CLIP_GRADIENT.ENABLED in the config file is set to True to prevent gradient explosion.
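
As a hedged sketch, detectron2's standard gradient-clipping options live under the SOLVER.CLIP_GRADIENTS node; the key named in the comment above (SOLVER.CLIP_GRADIENT) may be this repository's spelling of the same setting, so adapt the key name to whatever your config file actually uses.

```python
# Sketch assuming detectron2's standard SOLVER.CLIP_GRADIENTS config node; the
# UOAIS config may name this node SOLVER.CLIP_GRADIENT as mentioned above, in
# which case set the same fields under that key (or in the YAML config file).
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.SOLVER.CLIP_GRADIENTS.ENABLED = True       # guard against exploding gradients
cfg.SOLVER.CLIP_GRADIENTS.CLIP_TYPE = "value"  # clip each gradient element ("norm" is the alternative)
cfg.SOLVER.CLIP_GRADIENTS.CLIP_VALUE = 1.0     # clipping threshold
cfg.SOLVER.CLIP_GRADIENTS.NORM_TYPE = 2.0      # only used when CLIP_TYPE == "norm"
```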

@Lilzhuzixi commented:

Hello, friend! Did you work out this error? Can you give me some advice?

@niushou (Author) commented May 22, 2024 via email
