I am trying to train LLaMA-58M using your code.
However, the teacher models referenced in your code are LLaMA-58M, LLaMA-360M, and GPT2-705M.
Do I need all three of these models to train a single student?
You only need LLaMA-360M and GPT2-705M as teachers. You can pretrain both of them first, and then use them to train the LLaMA-58M student.
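For anyone wondering how two teachers can feed a single student, here is a rough sketch of one common approach. It assumes the teachers' logits are averaged and the student is trained to match the softened average with a KL-divergence term; the thread does not confirm that the repository does exactly this. The function names, the temperature value, and the toy logits are all made up for illustration, and plain Python is used in place of a deep learning framework to keep it dependency-free.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_distillation_loss(student_logits, teacher_logits_list, temperature=2.0):
    """KL(avg-teacher || student) on softened distributions (hypothetical helper).

    Assumption: teachers are combined by averaging their logits.
    """
    n = len(teacher_logits_list)
    # Average the teachers' logits position by position.
    avg_teacher = [sum(col) / n for col in zip(*teacher_logits_list)]
    p = softmax(avg_teacher, temperature)      # soft targets from the ensemble
    q = softmax(student_logits, temperature)   # student's softened prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # Scaling by T^2 is the usual convention to keep gradient magnitudes comparable.
    return kl * temperature ** 2

# Toy next-token logits over a 3-word vocabulary (invented numbers).
llama_360m_logits = [2.0, 0.5, -1.0]
gpt2_705m_logits = [1.5, 1.0, -0.5]
student_logits = [0.2, 0.1, 0.0]

loss = ensemble_distillation_loss(
    student_logits, [llama_360m_logits, gpt2_705m_logits]
)
print(f"distillation loss: {loss:.4f}")
```

In this sketch only the two pretrained teachers appear on the teacher side; the 58M model shows up only as the student being trained.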
Thank you for your work.