Training with GPU #119
4 comments · 22 replies
-
@chiku-parida First, update the code and then set num_workers=0; this is very important. Also note that you should not use more than one GPU. I think the cause of this issue is that the code mixes numpy arrays with tensor operations: when the default device is the GPU, the numpy arrays stay on the CPU while the tensors live on the GPU, which triggers this error. If you want to use multiple cards, you need to convert all the numpy arrays into tensors and make sure they are assigned to the correct GPU. If that doesn't work, you may need to set generator="cuda" in MGLDataLoader. I have since modified my own code, so I don't remember the exact parameter changes. I hope this helps.
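For illustration, here is a minimal sketch of the numpy/tensor device mismatch described above; the shapes and variable names are made up for the example:

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x_np = np.random.rand(4, 3)            # numpy arrays always live on the CPU
w = torch.randn(3, 2, device=device)   # this tensor lives on the GPU

# torch.from_numpy(x_np) @ w would raise the device-mismatch RuntimeError,
# because one operand is on the CPU and the other on the GPU. Converting
# the array to a tensor *on the same device* fixes it:
x = torch.as_tensor(x_np, dtype=torch.float32, device=device)
y = x @ w                              # both operands on the same device
```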
-
Please refer to the PyTorch documentation on setting the default device: https://pytorch.org/docs/stable/notes/cuda.html. In general, if you wrap your entire code in a `with torch.device("cuda"):` block, new tensors will be created on the GPU by default.
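As a concrete sketch of that approach (assuming PyTorch >= 2.0, where a `torch.device` can act as a context manager):

```python
import torch

# Tensors created inside the block default to the given device.
with torch.device("cuda"):
    a = torch.randn(3, 3)
    print(a.device)  # cuda:0

# Equivalently, set the default device for the whole process:
torch.set_default_device("cuda")
b = torch.ones(2, 2)
print(b.device)      # cuda:0
```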
-
Hi! Which Python version do you recommend for working with MatGL and CUDA?
-
Maybe a stupid question about "Training a MEGNet Formation Energy Model with PyTorch Lightning"!
-
I have tried initializing my GPU at the beginning like below:

```python
import torch

if torch.cuda.is_available():
    device = 'cuda'
else:
    device = 'cpu'
print(f'The available device is {device}')
```
The model is detecting the GPU correctly, but I still don't understand which tensors should be assigned to the GPU. I am getting the error below. Please help!
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
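For anyone hitting the same error, here is a minimal sketch of what typically causes it; the model here is illustrative, not MatGL's API:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(3, 2).to(device)  # parameters now live on the GPU
x = torch.randn(4, 3)               # created on the CPU by default

# model(x) at this point would raise:
#   RuntimeError: Expected all tensors to be on the same device, ...
# because the weights are on cuda:0 while x is on the CPU.
x = x.to(device)                    # move the input to the same device
y = model(x)
print(y.device)                     # cuda:0 when a GPU is available
```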