Add support for Task Arithmetics #698

Merged
merged 20 commits into from
Aug 2, 2024

@lenglaender lenglaender commented May 8, 2024

This PR adds support for various task arithmetic options for LoRA. Until now, our library supported averaging adapters only by linearly combining them. This may be insufficient, especially for LoRA, so several publications have proposed other ways to perform task arithmetic.

This PR:

  • makes it easier to implement different weighting methods
  • adds 2 additional merging methods for LoRA following these papers
  • adds a method to merge heads
  • adds documentation and a notebook

@lenglaender lenglaender changed the title WIP: Add support for Task Arithmetics Add support for Task Arithmetics Jul 4, 2024
@lenglaender lenglaender marked this pull request as ready for review July 4, 2024 11:54
@lenglaender lenglaender requested review from calpt, TimoImhof and hSterz and removed request for calpt and TimoImhof July 10, 2024 12:34

@calpt calpt left a comment

This looks very good overall!

Looked over everything except for the notebook and left some comments.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
docs/adapter_composition.md Outdated Show resolved Hide resolved
src/adapters/methods/lora.py Show resolved Hide resolved
src/adapters/model_mixin.py Outdated Show resolved Hide resolved
src/adapters/model_mixin.py Outdated Show resolved Hide resolved
src/adapters/model_mixin.py Outdated Show resolved Hide resolved
tests/methods/test_lora.py Outdated Show resolved Hide resolved
tests/methods/test_lora.py Outdated Show resolved Hide resolved

@hSterz hSterz left a comment

Looks good to me. I just have a few small questions.

src/adapters/methods/bottleneck.py Show resolved Hide resolved
src/adapters/models/deberta/mixin_deberta.py Show resolved Hide resolved
$$
\Phi_{merged} = \sum_{i=0}^{N} \lambda_i \Phi_i
$$

2. `combine_strategy = "lora_linear_only_negate_b"` Following [Zhang et al. (2023)](https://proceedings.neurips.cc/paper_files/paper/2023/hash/299a08ee712d4752c890938da99a77c6-Abstract-Conference.html), this method only uses negative weights for the B-matrix if the weight is negative:
Member

The "only" in the name is redundant. I would remove it to make the name shorter.

Member Author

I believe it is best to keep it: if we simply call it `lora_linear_negate_b`, it sounds as if the B matrix is always negated. What the method actually does is, when a weight is negative, negate only the B matrix and not the A matrix.
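
To make the naming argument concrete, here is a minimal NumPy sketch of why the sign matters for LoRA, where the weight update is the product `delta_W = B @ A`. The helper names `merge_linear` and `merge_only_negate_b` are hypothetical and are not the library's API. Scaling A by the absolute value of the weight is an assumption about the exact form of the method. With a single adapter and weight -1, applying the weight to both matrices cancels the sign, while applying it to B alone negates the update.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 6, 2  # hidden size and LoRA rank (illustrative values)

# One LoRA adapter; its weight update is delta_W = B @ A.
A = rng.normal(size=(r, d))
B = rng.normal(size=(d, r))
delta_w = B @ A


def merge_linear(As, Bs, weights):
    """Hypothetical helper: apply the same weight to A and B of each adapter."""
    A_m = sum(w * A_i for w, A_i in zip(weights, As))
    B_m = sum(w * B_i for w, B_i in zip(weights, Bs))
    return A_m, B_m


def merge_only_negate_b(As, Bs, weights):
    """Hypothetical helper: scale A by |w| and B by w (assumed form),
    so a negative weight flips the sign of B only."""
    A_m = sum(abs(w) * A_i for w, A_i in zip(weights, As))
    B_m = sum(w * B_i for w, B_i in zip(weights, Bs))
    return A_m, B_m


A_lin, B_lin = merge_linear([A], [B], [-1.0])
A_neg, B_neg = merge_only_negate_b([A], [B], [-1.0])

# (-B) @ (-A) == B @ A: the two signs cancel, so nothing is negated.
sign_cancels = np.allclose(B_lin @ A_lin, delta_w)
# (-B) @ A == -(B @ A): the update is actually negated.
sign_survives = np.allclose(B_neg @ A_neg, -delta_w)
```

Both flags evaluate to `True`, which is the behavior the longer name is meant to signal: B is negated only when the weight is negative, and A is never negated.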

docs/adapter_composition.md Outdated Show resolved Hide resolved
tests/test_adapter_heads.py Show resolved Hide resolved
lenglaender and others added 5 commits July 28, 2024 20:50
Co-authored-by: calpt <calpt@mail.de>
Co-authored-by: calpt <calpt@mail.de>
- move adapter merging to own docs page
- move `average_head` method from `ModelAdaptersMixin` to `ModelWithHeadsAdaptersMixin`
- In lora.py: move the SVD computation into a helper function because it was a bit too lengthy in the `average_adapter` function
- test_lora.py: split test cases
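
The commit above only says that an SVD computation was moved into a helper. As a rough illustration of how an SVD can be used when merging LoRA adapters, the sketch below sums the weighted updates `B_i @ A_i` and re-factors the result into new low-rank matrices. The function name `merge_delta_w_svd`, its parameters, and the choice to fold the singular values into B are all assumptions for illustration, not the code in `lora.py`.

```python
import numpy as np


def merge_delta_w_svd(As, Bs, weights, rank):
    """Hypothetical helper: merge delta_W_i = B_i @ A_i with the given weights,
    then factor the result back into rank-`rank` matrices via SVD."""
    delta_w = sum(w * (B_i @ A_i) for w, A_i, B_i in zip(weights, As, Bs))
    U, S, Vt = np.linalg.svd(delta_w, full_matrices=False)
    # Keep the top `rank` singular directions; fold singular values into B
    # (an arbitrary choice made for this sketch).
    B_m = U[:, :rank] * S[:rank]
    A_m = Vt[:rank, :]
    return A_m, B_m


rng = np.random.default_rng(0)
d, r = 8, 2  # hidden size and LoRA rank (illustrative values)
As = [rng.normal(size=(r, d)) for _ in range(2)]
Bs = [rng.normal(size=(d, r)) for _ in range(2)]
weights = [0.5, -0.5]

# Two rank-r updates sum to a matrix of rank at most 2 * r,
# so keeping 2 * r singular values reconstructs the merged update.
A_m, B_m = merge_delta_w_svd(As, Bs, weights, rank=2 * r)
target = sum(w * (B_i @ A_i) for w, A_i, B_i in zip(weights, As, Bs))
exact_at_full_rank = np.allclose(B_m @ A_m, target)
```

Choosing a smaller `rank` would give the best low-rank approximation of the merged update rather than an exact reconstruction.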
@lenglaender lenglaender requested a review from calpt July 29, 2024 09:20

@calpt calpt left a comment

Looks very good to me!

notebooks/06_Task_Arithmetics.ipynb Show resolved Hide resolved
@lenglaender lenglaender merged commit 8ddbcc8 into adapter-hub:main Aug 2, 2024
4 checks passed
3 participants