Out Of Memory Error GPU #11
Comments
It looks like you have simply run out of GPU memory. GeoDock is an LLM and uses a large amount of memory during processing. For instance, I cannot run a prediction docking a 300-amino-acid protein to a 550-amino-acid protein on a 24 GB GPU.
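To see how much headroom the GPU has before calling `dock`, something like the following could be run in a cell first. This is only a minimal sketch: `format_gpu_memory` and `report_gpu_memory` are hypothetical helper names made up for this example and are not part of GeoDock. The only PyTorch calls used are `torch.cuda.is_available()` and `torch.cuda.mem_get_info()`, which returns the free and total memory of the current device in bytes.

```python
def format_gpu_memory(free_bytes: int, total_bytes: int) -> str:
    """Render free/used/total GPU memory in GiB (hypothetical helper)."""
    gib = 1024 ** 3
    used_bytes = total_bytes - free_bytes
    return (
        f"free {free_bytes / gib:.2f} GiB, "
        f"used {used_bytes / gib:.2f} GiB, "
        f"total {total_bytes / gib:.2f} GiB"
    )


def report_gpu_memory() -> str:
    """Query the current CUDA device, if any, and describe its memory."""
    try:
        import torch  # imported lazily so the helper above works without it
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device available"
    # mem_get_info() returns (free, total) in bytes for the current device.
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    return format_gpu_memory(free_bytes, total_bytes)


print(report_gpu_memory())
```

If the free figure is already low before the model is loaded, another process or an earlier cell is probably still holding memory, and restarting the runtime may be worth trying before anything else.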
Hi all,
I have a GPU memory issue. I'm using Google Colab with an A100 GPU, and it appears to be a GPU memory management problem, but I can't solve it. Could you help me?
When I run the prediction:
#@title Run Prediction
from geodock.GeoDockRunner import GeoDockRunner
torch.cuda.empty_cache()  # torch was imported in an earlier cell
ckpt_file = "/content/GeoDock/geodock/weights/dips_0.3.ckpt"
geodock = GeoDockRunner(ckpt_file=ckpt_file)
pred = geodock.dock(
    partner1=partner1,  # partner1, partner2, out_name and do_refine are set in earlier cells
    partner2=partner2,
    out_name=out_name,
    do_refine=do_refine,
    use_openmm=True,
)
This error appears:
OutOfMemoryError Traceback (most recent call last)
in <cell line: 6>()
4 ckpt_file = "/content/GeoDock/geodock/weights/dips_0.3.ckpt"
5 geodock = GeoDockRunner(ckpt_file=ckpt_file)
----> 6 pred = geodock.dock(
7 partner1=partner1,
8 partner2=partner2,
23 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in relu(input, inplace)
1469 result = torch.relu_(input)
1470 else:
-> 1471 result = torch.relu(input)
1472 return result
1473
OutOfMemoryError: CUDA out of memory. Tried to allocate 994.00 MiB. GPU 0 has a total capacty of 39.56 GiB of which 884.81 MiB is free. Process 85668 has 38.69 GiB memory in use. Of the allocated memory 37.87 GiB is allocated by PyTorch, and 336.05 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Thanks!
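The error message points to `max_split_size_mb` and `PYTORCH_CUDA_ALLOC_CONF`. A minimal sketch of setting that option is shown here, assuming it runs in a fresh runtime before `torch` is imported, since the allocator reads the variable when it initializes. The value 128 is purely illustrative, and `build_alloc_conf` is a hypothetical helper written for this example.

```python
import os


def build_alloc_conf(max_split_size_mb: int) -> str:
    """Build a PYTORCH_CUDA_ALLOC_CONF value (hypothetical helper)."""
    if max_split_size_mb <= 0:
        raise ValueError("max_split_size_mb must be a positive integer")
    return f"max_split_size_mb:{max_split_size_mb}"


# Set this before importing torch; 128 is an illustrative value, not a recommendation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = build_alloc_conf(128)
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Note that in the message above only 336.05 MiB is reserved but unallocated, against a 994.00 MiB request and 37.87 GiB already allocated, so fragmentation does not look like the main cause and this setting may not help much here.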