
The Implementation of AdaLoRA (ICLR 2023) #233

Merged
30 commits merged into huggingface:main on Apr 6, 2023

Conversation

@QingruZhang (Contributor):

Dear PEFT maintainers,
This is Qingru Zhang, the author of "Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning" (ICLR 2023, please see the link). We would like to submit this PR to integrate AdaLoRA into PEFT. It was a great discussion with Sourab about the implementation of AdaLoRA and its integration into PEFT. Thanks a lot for Sourab's comments and support while we prepared this PR. It would be great to have AdaLoRA available in PEFT! Please let me know in case of any questions about the implementation.

Thanks,
Qingru
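
For reference, a minimal sketch of wrapping a model with AdaLoRA through the PEFT API this PR adds. The base model, target modules, and hyperparameter values below are illustrative placeholders, not settings taken from this PR:

```python
# Minimal sketch: wrapping a model with AdaLoRA via PEFT.
# The base model, target_modules, and hyperparameter values are illustrative only.
from transformers import AutoModelForSeq2SeqLM
from peft import AdaLoraConfig, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

config = AdaLoraConfig(
    init_r=12,                   # initial rank of each incremental matrix (initial budget)
    target_r=8,                  # average target rank after budget allocation (final budget)
    tinit=200,                   # warmup steps before rank pruning starts
    tfinal=1000,                 # final fine-tuning steps after the budget schedule ends
    deltaT=10,                   # steps between budget reallocations
    total_step=3000,             # total training steps, used by the budget scheduler
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q", "v"],   # query/value projections in T5 (example)
    task_type="SEQ_2_SEQ_LM",
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```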

@pacman100 (Contributor) left a comment:

This is awesome 🔥. Well done @QingruZhang, and thank you for making AdaLoRA easy to use for the community 🤗. LGTM!

Left a few comments and suggestions. Could you also run make style and make quality to fix the code-quality CI?

Resolved review threads: src/peft/tuners/adalora.py (6 threads, 5 now outdated) and src/peft/mapping.py (1 thread, now outdated).
QingruZhang and others added 7 commits on April 5, 2023 at 16:23.
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
@HuggingFaceDocBuilderDev commented on Apr 5, 2023:

The documentation is not available anymore as the PR was closed or merged.

@QingruZhang requested a review from @pacman100 on April 5, 2023 at 21:51.
@pacman100 (Contributor) commented on Apr 6, 2023:

Hello @QingruZhang, I applied AdaLoRA to Whisper-large fine-tuning; here is the wandb run:

  1. There is an improvement in normalized WER (2.4% relative improvement, matching the fully fine-tuned model up to the first decimal place) in comparison to LoRA. However, there is no improvement in WER.
  2. Interestingly, it preserved a lot more trainable params in the encoder than in the decoder.
  3. In the decoder, fc1 was the most important target layer for adding the low-rank matrices.
  4. Final trainable params after the budget-aware AdaLoRA tuning: trainable params: 15520256 || all params: 1558825701 || trainable%: 0.9956376771337311. For LoRA it is trainable params: 15728640 || all params: 1559033600 || trainable%: 1.0088711365810203. So, slightly fewer trainable params than LoRA after the budget-aware pruning (see the training-loop sketch below).
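
The budget-aware pruning referenced in point 4 happens during training via the rank allocator. A minimal sketch of the AdaLoRA training step, continuing from a PEFT-wrapped `model` as in the earlier sketch; `dataloader`, `optimizer`, and `num_epochs` are placeholders for the user's own setup:

```python
# Minimal sketch of an AdaLoRA training step with budget-aware rank allocation.
# `model` is a PEFT-wrapped model (get_peft_model with an AdaLoraConfig);
# `dataloader`, `optimizer`, and `num_epochs` are placeholders.
global_step = 0
for epoch in range(num_epochs):
    for batch in dataloader:
        outputs = model(**batch)
        loss = outputs.loss      # in PEFT, AdaLoraModel adds the orthogonal regularization to the loss internally
        loss.backward()
        optimizer.step()
        # Reallocate the rank budget (prune unimportant singular values) according to the schedule.
        model.base_model.update_and_allocate(global_step)
        optimizer.zero_grad()
        global_step += 1

# The "trainable params: ... || all params: ... || trainable%: ..." lines in point 4
# are the output format of PEFT's print_trainable_parameters helper:
model.print_trainable_parameters()
```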

[Screenshot attached: 2023-04-06, 11:07 AM]

@pacman100 (Contributor) left a comment:

Thank you @QingruZhang for iterating, LGTM! 🤗

@pacman100 merged commit a7d5e51 into huggingface:main on Apr 6, 2023.
@QingruZhang (Contributor, Author) commented:

Hello @pacman100, thanks for merging the commits and running the tests for AdaLoRA! Typically, we should set the initial budget to 1.5 times the final target budget and tune the budget schedule so there are enough final fine-tuning steps to get good performance. Please let me know if there are more experimental tests I need to run. Thanks again for your help during this process!
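
As a concrete illustration of that recommendation, a hedged sketch of an AdaLoraConfig budget schedule; the specific step counts are placeholders, and only the 1.5x ratio and the final fine-tuning window come from the comment above:

```python
# Illustrative budget schedule following the 1.5x recommendation above.
# The step counts are placeholders; only the ratio and the schedule structure
# come from the comment.
from peft import AdaLoraConfig

total_step = 10_000               # total optimizer steps planned for the run
target_r = 8                      # final target budget (average rank)
init_r = int(1.5 * target_r)      # initial budget ~1.5x the final target budget

config = AdaLoraConfig(
    init_r=init_r,
    target_r=target_r,
    tinit=500,                    # warmup steps before pruning begins
    tfinal=3_000,                 # leave enough final fine-tuning steps after the budget is fixed
    deltaT=10,                    # steps between budget reallocations
    total_step=total_step,
)
```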

@chenweizhu commented on Apr 26, 2023:

Hi @pacman100, did you also measure the peak GPU memory consumption, training time, and other metrics?

It would also be interesting to compare all these metrics, including quality, when the budget is set to 1.5x or 2x.
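
For anyone who wants to collect those numbers, a minimal sketch using PyTorch's built-in counters; `train()` below is only a placeholder for the fine-tuning loop being measured:

```python
# Minimal sketch: record wall-clock time and peak GPU memory around a training run.
import time
import torch

def train():
    pass  # placeholder for the actual fine-tuning loop

torch.cuda.reset_peak_memory_stats()
start = time.time()

train()

elapsed = time.time() - start
peak_mem_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"training time: {elapsed:.1f}s, peak GPU memory: {peak_mem_gib:.2f} GiB")
```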

@chenweizhu commented on Apr 26, 2023:

And how about running some tests on the Llama 7B or 13B models?
