@juyongjiang Thank you for this great work!
How can I fine-tune the model using less memory?
I'm hitting a CUDA OOM error while trying to fine-tune on Google Colab Pro with a T4 (15 GB)...
Thanks!

@mellahysf Hi, essentially we use low-rank adaptation (LoRA) to fine-tune the LLMs, which makes it feasible on a single GPU with limited memory. If only 15 GB of memory is available, I suggest reducing the rank parameter (`--lora_r=4`), though this may decrease model performance, and setting a smaller batch size (`--batch_size=64`). Have a try. : )
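For anyone curious why a smaller `--lora_r` saves memory: LoRA trains only two low-rank factors per adapted weight matrix instead of the full matrix, so the trainable-parameter (and optimizer-state) footprint scales with the rank `r`. Here is a quick back-of-the-envelope sketch; the 4096 dimension is illustrative (a LLaMA-7B-like projection), not this repo's exact config:

```python
# LoRA replaces the update to a d_out x d_in weight W with two small
# factors: B (d_out x r) and A (r x d_in). Only A and B are trained,
# so trainable params grow linearly in the rank r.

def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one LoRA-adapted weight matrix."""
    return r * d_in + d_out * r

d = 4096                  # illustrative hidden size of the adapted projection
full = d * d              # params if we fine-tuned the full matrix instead
for r in (16, 8, 4):      # candidate --lora_r values; 4 is the low-memory pick
    lora = lora_trainable_params(d, d, r)
    print(f"r={r}: {lora} trainable params ({lora / full:.4%} of full)")
```

With `r=4` each adapted 4096x4096 matrix trains only 32,768 parameters (about 0.2% of the 16.8M in the full matrix), which is why halving or quartering the rank is the first knob to turn when a 15 GB GPU runs out of memory, ahead of shrinking the batch size.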