Add Llama3 support to llama_adapter #147
Conversation
Force-pushed from 5505253 to 060935d
Brilliant, thanks so much for this Radhika! Two minor requests:
- If you could update the README to say we now support these models too, that would be great
- Could you run a slicing + finetuning experiment and report back numbers, like Pashmina did in her Phi-3 PR?
I was mid-way through editing the PR description, apologies if it was confusing. I'm currently running the experiments and will add the numbers to the PR and mark it as ready for review when done. I've added the list of supported models to the README; is there anything else we need to add there?
@nailimixaM Added the piqa results above!
Suggest changing the title of the PR to "Add Llama3 support to llama_adapter"
Thanks, is this with wikitext2?
I used the default set-up, so it would be wikitext2
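For reference, a minimal sketch of how the default wikitext2 calibration data can be loaded with the standard Hugging Face `datasets` API. The tokenizer name, sample count, and sequence length here are illustrative assumptions, not the repo's actual defaults:

```python
# Sketch: loading the default wikitext2 calibration data via Hugging Face
# datasets. The tokenizer name, sample count, and sequence length are
# illustrative assumptions, not the repo's actual defaults.
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")  # gated repo; assumes HF access

# Keep non-empty lines and tokenize each into a fixed-length calibration sample.
texts = [t for t in dataset["text"] if t.strip()]
calibration_samples = [
    tokenizer(t, return_tensors="pt", truncation=True, max_length=2048)
    for t in texts[:128]
]
```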
Force-pushed from 060935d to c314f66
* Update dependencies (#144)
  * Separate out dependencies for experiments
  * Raise peft version
  * Bump transformers
  * Bump datasets
  * Update README
  * Rollback peft to 0.6.0 and make it optional
* Use a task metric map in lm_eval runner (#146)
* Add Phi-3-mini adapter (#145)
  * Add Phi-3 adapter
  * Removed cast. Aligned type with base class.
* Add support for llama3 adapter (#147)
* Update transformers to 4.41.0 (#150)
  * Update transformers to latest
  * Spaces
  * Point to bug fix commit we want to pick
  * Update pyproject.toml
  * update README
  * update

---------

Co-authored-by: Dmitry Kats <dmitrykats@microsoft.com>
Co-authored-by: Pashmina Cameron <pcameron@microsoft.com>
Co-authored-by: radhikamp99 <47057131+radhikamp99@users.noreply.github.com>
Adding support for Llama 3 models via the existing Llama model adapter; there are no architectural changes since Llama 2.
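As a rough illustration of why the change is small, something like the following is essentially all an adapter needs to do to accept the new checkpoints. The class and method names below are hypothetical stand-ins, not the repo's actual `llama_adapter` API:

```python
# Hypothetical sketch (not the repo's exact API): because Llama 3 reuses the
# Llama 2 architecture, the adapter mostly just needs to accept the new
# model names; the transformers LlamaForCausalLM class handles both.
from transformers import LlamaForCausalLM, PreTrainedModel

SUPPORTED_PREFIXES = (
    "meta-llama/Llama-2",       # already supported
    "meta-llama/Meta-Llama-3",  # added by this PR
)

class LlamaModelAdapter:
    """Stand-in for the repo's actual llama_adapter class."""

    def __init__(self, model: PreTrainedModel) -> None:
        self.model = model

    @classmethod
    def from_pretrained(cls, model_name: str) -> "LlamaModelAdapter":
        if not model_name.startswith(SUPPORTED_PREFIXES):
            raise ValueError(f"Unsupported model: {model_name}")
        return cls(LlamaForCausalLM.from_pretrained(model_name))
```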
Models now supported, and added to the README: the Llama 3 8B and 70B model families.
Ran `test_model_adapter.py` and all tests passed. Ran SliceGPT and finetuning experiments and evaluated to get the following results. I was unable to test the 70B models due to memory constraints.
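For anyone wanting to reproduce the piqa evaluation, a minimal sketch using lm-evaluation-harness is below. The model argument shown is the base checkpoint (a sliced/finetuned checkpoint directory could be substituted), and the exact `simple_evaluate` arguments vary across lm_eval versions:

```python
# Sketch of reproducing the piqa numbers with lm-evaluation-harness; the
# model shown is the base checkpoint (a sliced checkpoint dir could be
# substituted), and argument names can differ between lm_eval versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B",
    tasks=["piqa"],
    batch_size=8,
)
print(results["results"]["piqa"])
```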