[11/15 02:06:37 adet.trainer]: Starting training from iteration 0
Traceback (most recent call last):
File "train_net.py", line 303, in
args=(args,),
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "train_net.py", line 286, in main
return trainer.train()
File "train_net.py", line 83, in train
self.train_loop(self.start_iter, self.max_iter)
File "train_net.py", line 73, in train_loop
self.run_step()
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 494, in run_step
self._trainer.run_step()
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 287, in run_step
self._write_metrics(loss_dict, data_time)
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 302, in _write_metrics
SimpleTrainer.write_metrics(loss_dict, data_time, prefix)
File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 339, in write_metrics
f"Loss became infinite or NaN at iteration={storage.iter}!\n"
FloatingPointError: Loss became infinite or NaN at iteration=0!
loss_dict = {'loss_cls': 157.09115600585938, 'loss_box_reg': 5.162332534790039, 'loss_visible_mask': 3.1945271492004395, 'loss_amodal_mask': 2.944978952407837, 'loss_occ_cls': nan, 'loss_rpn_cls': 9.696294784545898, 'loss_rpn_loc': 12.890896797180176}
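Since only loss_occ_cls is NaN here, it helps to confirm which head produces the non-finite value before the trainer sums the losses. The snippet below is only a debugging sketch; the function name and the place you call it from are assumptions, not part of the uoais or detectron2 code:

import torch

def report_nonfinite_losses(loss_dict):
    """Return (and print) any loss terms that are NaN or Inf.

    loss_dict is the per-head dictionary detectron2 builds each step,
    e.g. {'loss_cls': ..., 'loss_occ_cls': ..., 'loss_rpn_cls': ...}.
    """
    bad = {
        k: (v.item() if torch.is_tensor(v) else v)
        for k, v in loss_dict.items()
        if not torch.isfinite(torch.as_tensor(v)).all()
    }
    if bad:
        print(f"Non-finite loss terms: {bad}")
    return bad

# Example (hypothetical hook point): call it from your own run_step override
# right after the model returns loss_dict, before the losses are summed.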
Hello friend, this problem is caused by the deep learning environment. Make sure your environment is consistent with the code's requirements, and switch to a GPU with more memory.
---Original---
Subject: Re: [gist-ailab/uoais] FloatingPointError: Loss became infinite or NaN at iteration=0! (Issue #16)
Hello friend! Did you work out this error? Can you give me some advice?
The loss_occ_cls of the first iteration is 0
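If that means the occlusion-classification head sees zero valid targets in the first batch, a mean-reduced cross entropy over an empty selection is 0/0, which would show up exactly as the loss_occ_cls: nan above on recent PyTorch versions. A minimal sketch of that failure mode and one possible guard; the function and tensor names are illustrative, not the actual uoais implementation:

import torch
import torch.nn.functional as F

def occ_cls_loss(logits, labels, valid_mask):
    """Occlusion-classification loss with a guard against empty batches.

    logits:     (N, 2) predicted occluded / not-occluded scores
    labels:     (N,)   ground-truth occlusion labels
    valid_mask: (N,)   bool mask of ROIs that actually have an occlusion label
    """
    if valid_mask.sum() == 0:
        # cross_entropy with reduction='mean' over zero elements comes out NaN;
        # return a zero that still participates in the graph instead.
        return logits.sum() * 0.0
    return F.cross_entropy(logits[valid_mask], labels[valid_mask])

# Demonstration of the failure mode:
logits = torch.randn(4, 2, requires_grad=True)
labels = torch.randint(0, 2, (4,))
empty = torch.zeros(4, dtype=torch.bool)               # no valid occlusion targets
print(F.cross_entropy(logits[empty], labels[empty]))   # tensor(nan, ...)
print(occ_cls_loss(logits, labels, empty))             # tensor(0., ...)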