
Add Llama3 support to llama_adapter #147

Merged: 1 commit merged into microsoft:main from radhika/llama_3_adapter on May 14, 2024

Conversation

@radhikamp99 (Contributor) commented on May 13, 2024

Adding support for Llama 3 models via the existing Llama model adapter; there are no architectural changes since Llama 2.
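
For readers unfamiliar with the adapter layout, here is a minimal sketch of the kind of change involved. The class and method names below (`LlamaModelAdapter`, `supports_model`, `from_pretrained`) are illustrative assumptions, not the repository's actual adapter API:

```python
# Illustrative sketch only: the class and method names below are hypothetical
# and do not reflect the actual adapter interface in this repository.
import torch
from transformers import LlamaForCausalLM


class LlamaModelAdapter:
    # Llama 3 reuses the Llama 2 decoder architecture, so a single adapter can
    # serve both families; only the accepted model-name prefixes change.
    SUPPORTED_PREFIXES = (
        "meta-llama/Llama-2-",
        "meta-llama/Meta-Llama-3-",  # the kind of addition this PR makes
    )

    def __init__(self, model: LlamaForCausalLM) -> None:
        self.model = model

    @classmethod
    def supports_model(cls, model_name: str) -> bool:
        # str.startswith accepts a tuple of prefixes.
        return model_name.startswith(cls.SUPPORTED_PREFIXES)

    @classmethod
    def from_pretrained(cls, model_name: str) -> "LlamaModelAdapter":
        if not cls.supports_model(model_name):
            raise ValueError(f"Unsupported model: {model_name}")
        model = LlamaForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
        return cls(model)
```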

Models now supported and added to the README:

Ran test_model_adapter.py and all tests passed. Ran SliceGPT and finetuning experiments and evaluated to get the following results:

model: Meta-Llama-3-8B
piqa: original: 0.8079, sliced@25%: 0.5871, recovery finetuned: 0.6817

I was unable to test the 70B models due to memory constraints.
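
For anyone wanting to reproduce the piqa number, a hedged sketch of a harness-based evaluation; it assumes lm-eval-harness 0.4.x and is not the exact evaluation pipeline used for the results above:

```python
# Sketch only: assumes lm-eval-harness >= 0.4 is installed; this is not the
# exact evaluation setup used to produce the numbers in this PR.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face causal-LM backend
    model_args="pretrained=meta-llama/Meta-Llama-3-8B,dtype=float16",
    tasks=["piqa"],
    batch_size=8,
)
print(results["results"]["piqa"])
```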

@radhikamp99 force-pushed the radhika/llama_3_adapter branch from 5505253 to 060935d on May 13, 2024 12:30
@nailimixaM (Collaborator) left a comment


Brilliant, thanks so much for this Radhika! Two minor requests:

  • If you could update the README to say we now support these models too, that would be great
  • Could you run a slicing + finetuning experiment and report back numbers, like Pashmina did in her Phi-3 PR?

@radhikamp99 (Contributor, Author)

> Brilliant, thanks so much for this Radhika! Two minor requests:
>
>   • If you could update the README to say we now support these models too, that would be great
>   • Could you run a slicing + finetuning experiment and report back numbers, like Pashmina did in her Phi-3 PR?

I was mid-way through editing the PR description; apologies if it was confusing. I'm currently running the experiments and will add the numbers to the PR and mark it as ready for review when done. I've added the list of supported models to the README; is there something else we need to add there?

@radhikamp99 marked this pull request as ready for review on May 13, 2024 13:08
@radhikamp99 (Contributor, Author)

@nailimixaM Added the piqa results above!

@pashminacameron (Contributor)

Suggest changing the title of the PR to "Add Llama3 support to llama_adapter"

@radhikamp99 changed the title from "Add support for llama3 adapter" to "Add Llama3 support to llama_adapter" on May 13, 2024
@nailimixaM (Collaborator)

> @nailimixaM Added the piqa results above!

Thanks, is this with wikitext or alpaca for slicing and finetuning?

@radhikamp99 (Contributor, Author)

> @nailimixaM Added the piqa results above!
>
> Thanks, is this with wikitext or alpaca for slicing and finetuning?

I used the default set-up, so it would be wikitext2.
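
For reference, wikitext2 can be pulled from the Hugging Face Hub as below; this is a generic sketch, not the repository's own data-loading code:

```python
# Generic sketch of loading wikitext2 from the Hugging Face Hub; the repo's
# own data utilities may wrap this differently.
from datasets import load_dataset

train = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
print(train[0])
```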

@radhikamp99 force-pushed the radhika/llama_3_adapter branch from 060935d to c314f66 on May 13, 2024 17:06
@pashminacameron merged commit d74a61d into microsoft:main on May 14, 2024
2 checks passed
nailimixaM added a commit that referenced this pull request on Jun 18, 2024
* Update dependencies (#144)

* Separate out dependencies for experiments

* Raise peft version

* Bump transformers

* Bump datasets

* Update README

* Rollback peft to 0.6.0 and make it optional

* Use a task metric map in lm_eval runner (#146)

* Add Phi-3-mini adapter (#145)

* Add Phi-3 adapter

* Removed cast. Aligned type with base class.

* Add support for llama3 adapter (#147)

* Update transformers to 4.41.0 (#150)

* Update transformers to latest

* Spaces

* Point to bug fix commit we want to pick

* Update pyproject.toml

* update README

* update

---------

Co-authored-by: Dmitry Kats <dmitrykats@microsoft.com>
Co-authored-by: Pashmina Cameron <pcameron@microsoft.com>
Co-authored-by: radhikamp99 <47057131+radhikamp99@users.noreply.github.com>