
Is it possible to full finetune Llama3-8B on 2xA100 (40GB VRAM per GPU)? #1166


I believe you could fine-tune the 8B model on even a single A100. For two GPUs, use the 8B_full config, which is meant for the distributed recipes. You can launch with tune run --nnodes=1 --nproc-per-node=2 full_finetune_distributed --config llama3/8B_full. You can also run tune cp llama3/8B_full <my_config> to copy the config and adjust the batch size, learning rate, etc. to whatever is optimal for your compute setup.
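For example, a minimal launch sketch (the copied config filename and the override values below are illustrative, and the exact field names such as batch_size, gradient_accumulation_steps, and optimizer.lr should be verified against your copied YAML):

```sh
# Copy the built-in config so it can be edited locally (filename is just an example)
tune cp llama3/8B_full custom_8B_full.yaml

# Launch on a single node with 2 GPUs; trailing key=value pairs override config fields.
# Check that these field names match your copied YAML before running.
tune run --nnodes=1 --nproc-per-node=2 full_finetune_distributed \
  --config custom_8B_full.yaml \
  batch_size=2 \
  gradient_accumulation_steps=16 \
  optimizer.lr=2e-5
```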

Answer selected by nguyenhoanganh2002