Irregular RAM usage under large amount of epochs on gpu #872

Open
jdksjfisdf opened this issue Sep 2, 2024 · 2 comments

Comments

@jdksjfisdf

I was running a training job that needs many single_train_steps on my GPU (NVIDIA GeForce RTX 2060 Mobile) when I noticed irregular RAM (not video RAM) usage. To test, I changed the number of epochs in https://lux.csail.mit.edu/stable/tutorials/beginner/2_PolynomialFitting from 250 to 2 500 000. The RAM usage of the single process, as reported by KDE System Monitor, keeps increasing over time until it reaches 4.6 GB. The same issue does not happen if I disable LuxCUDA and run on the CPU. I think there is a memory leak.

@avik-pal
Member

avik-pal commented Sep 4, 2024

I can reproduce this, but I don't think it is a memory leak. It is probably just Julia not freeing memory that it doesn't need to free yet. I tried adding a GC.gc(true) at the end of the run and it was able to free all the memory, which (I think) wouldn't have been the case if it were a memory leak.

Though 4.6 GB seems extremely high. I am running the job with very limited available memory (~2GB) and then the memory usage saturates at a certain point. Can you try adding a GC.gc(true) at the end of every epoch and see if the memory usage still grows?
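
For anyone who wants to try this, here is a minimal sketch of what "GC.gc(true) at the end of every epoch" could look like. It does not use Lux or LuxCUDA: `fake_train_step!` and `train_with_gc` are hypothetical names, and `fake_train_step!` is only a stand-in for the real training step from the tutorial. `Base.gc_live_bytes()` is an unexported Base function, so treat its use here as an assumption that may change between Julia versions.

```julia
# Hypothetical stand-in for one training step. In the tutorial this would be
# the call that updates the train state; here it just allocates and reduces.
function fake_train_step!(buf::Vector{Float32})
    buf .= rand(Float32, length(buf))   # allocates a temporary array each call
    return sum(buf)
end

# Run `nepochs` steps and force a full collection every `gc_every` epochs,
# recording how many bytes the Julia GC considers live after each collection.
function train_with_gc(nepochs::Int; gc_every::Int=1)
    buf = zeros(Float32, 1024)
    live = Int[]
    for epoch in 1:nepochs
        fake_train_step!(buf)
        if epoch % gc_every == 0
            GC.gc(true)                          # full (not incremental) collection
            push!(live, Base.gc_live_bytes())    # bytes still reachable after GC
        end
    end
    return live
end

live = train_with_gc(100; gc_every=10)
println("samples taken: ", length(live))
println("live bytes after last GC: ", last(live))
println("peak resident set size (bytes): ", Sys.maxrss())
```

If the live-bytes figure stays flat while the process RSS reported by the system monitor keeps growing, that would suggest the growth is outside the Julia-managed heap, or that freed memory is not being returned to the OS, rather than objects being kept alive. Note that `Sys.maxrss()` reports the peak, so it never decreases.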

@jdksjfisdf
Author

I tested again. I am using JupyterLab, and GC.gc(true) did not work for me. I ran GC.gc(true) every 50000 epochs and once more at the end, and the memory usage was still 4.5 GB at the end. I don't know whether Jupyter or something else makes a difference.
