
Using the teacher model #2

Open
JJongyn opened this issue Mar 4, 2024 · 1 comment

Comments

@JJongyn
JJongyn commented Mar 4, 2024

Thank you for your work.

I am trying to train llama-58m with your code.
However, the teacher models used in your code are LLAMA-58M, LLAMA-360M, and GPT2-705M.
Do I need all three of these models to train a single student?

@shawnricecake (Owner)

You only need LLaMA-360M and GPT2-705M; you can pretrain both of them first.
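
For reference, here is a minimal sketch (not this repository's actual training script) of how a small student can be distilled from the soft targets of two pretrained teachers such as LLaMA-360M and GPT2-705M. The loss shape, the hyperparameters `T` and `alpha`, and the assumption that all models share the same tokenizer/vocabulary are illustrative only.

```python
# Minimal two-teacher distillation loss sketch (illustrative, not the repo's code).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, labels, T=2.0, alpha=0.5):
    """Mix hard-label cross-entropy with a KL term averaged over the teachers."""
    # Hard-label loss on the next-token targets.
    ce = F.cross_entropy(
        student_logits.reshape(-1, student_logits.size(-1)),
        labels.reshape(-1),
        ignore_index=-100,
    )
    # Soft-label loss: KL divergence to each teacher's temperature-scaled distribution.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    kd = 0.0
    for t_logits in teacher_logits_list:
        p_teacher = F.softmax(t_logits / T, dim=-1)
        kd = kd + F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
    kd = kd / len(teacher_logits_list)
    return alpha * ce + (1.0 - alpha) * kd
```

In the training loop you would run both pretrained teachers in eval mode under `torch.no_grad()`, collect their logits for each batch, and pass them to this loss together with the student's logits. This only works if the student and both teachers use the same tokenizer, so that their vocabulary dimensions line up.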
