RuntimeError: CUDA error: invalid argument #5
Comments
The reason is that the analyze_predictions function in deepnet_predictions.py is used only in the test process, not in training. My advice is to set an ipdb breakpoint at line 205 of referit3d_net_utils.py. I have been busy recently, so I will look into the failing case over the next few days.
Thanks for your reply! I will try your advice and see if it works.
Sorry, I pushed the wrong file; what I uploaded was actually the content of three_d_object.py. You can refer to this link: https://github.com/sega-hsj/MVT-3DVG/blob/main/referit3d/in_out/sr3d.py
I have revised the content of sr3d.py.
OK! But I haven't found a way to solve the first error yet. I retrained a model, and that didn't work.
It might be related to the key k in batch[k].
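One way to check which key is involved is to wrap the per-key transfer so that a failure reports the key, dtype, and shape of the offending entry. The following is only a minimal sketch: move_batch_to_device is a hypothetical helper, not a function from this repository, and it assumes batch is a dict whose tensor values expose a .to(device) method while non-tensor values (for example lists of strings) do not.

```python
import os

# CUDA errors can be reported asynchronously; blocking launches make the
# traceback point closer to the real failure. This has to be set before
# CUDA is initialized (assumption: this code runs at the top of the script).
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def move_batch_to_device(batch, device):
    """Hypothetical helper: move tensor-like values to `device`, naming the key that fails."""
    for k, v in batch.items():
        if not hasattr(v, "to"):
            # Entries without a .to method (lists, ints, ...) are left untouched.
            continue
        try:
            batch[k] = v.to(device)
        except RuntimeError as err:
            # Re-raise with the key and basic tensor metadata attached.
            shape = getattr(v, "shape", None)
            dtype = getattr(v, "dtype", None)
            raise RuntimeError(
                f"moving batch[{k!r}] failed (dtype={dtype}, shape={shape}): {err}"
            ) from err
    return batch
```

Calling this in place of the loop around line 205 of referit3d_net_utils.py should show whether one particular key (and which dtype or shape) triggers the error, which narrows down what to inspect in the dataset code.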
Hello again! Thanks for your earlier reply. I finally finished the training process, but when I recently tried to run test.sh, some new errors appeared.
I ran test.sh as your README says. The file is:
SR3D_GPT='/home/sd/Harddisk/sba/BS/ViewRefer3D-main/referit3d/data/Sr3D_release.csv'
PATH_OF_SCANNET_FILE='/home/sd/Harddisk/sba/BS/ViewRefer3D-main/referit3d/data/scanresult/keep_all_points_with_global_scan_alignment/keep_all_points_with_global_scan_alignment.pkl'
PATH_OF_REFERIT3D_FILE=${SR3D_GPT}
PATH_OF_BERT='/home/sd/Harddisk/sba/BS/ViewRefer3D-main/referit3d/data/bert'
The script runs for a while, then breaks at the same place every time:
100%|█████████▉| 1476/1478 [04:02<00:00, 6.34it/s]
100%|█████████▉| 1477/1478 [04:02<00:00, 6.32it/s]
100%|██████████| 1478/1478 [04:02<00:00, 7.01it/s]
100%|██████████| 1478/1478 [04:02<00:00, 6.09it/s]
0%| | 0/1478 [00:00<?, ?it/s]
0%| | 0/1478 [00:01<?, ?it/s]
The error is:
Traceback (most recent call last):
File "/home/sd/Harddisk/sba/BS/ViewRefer3D-main/referit3d/scripts/train_referit3d.py", line 291, in <module>
args, out_file=out_file,tokenizer=tokenizer)
File "/home/sd/Harddisk/sba/BS/ViewRefer3D-main/referit3d/analysis/deepnet_predictions.py", line 42, in analyze_predictions
net_stats = detailed_predictions_on_dataset(model, d_loader, args=args, device=device, FOR_VISUALIZATION=True,tokenizer=tokenizer)
File "/home/sd/anaconda3/envs/viewrefer/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/home/sd/Harddisk/sba/BS/ViewRefer3D-main/referit3d/models/referit3d_net_utils.py", line 205, in detailed_predictions_on_dataset
batch[k] = batch[k].to(device)
RuntimeError: CUDA error: invalid argument
Why does training run smoothly, while the error occurs only during testing?