new version Bone #2233

Merged: 5 commits merged into huggingface:main on Nov 27, 2024

Conversation

@JL-er (Contributor) commented on Nov 25, 2024

https://arxiv.org/abs/2409.15371
Bone has been updated to a new version. The current Bone is faster, more memory-efficient, and performs better than the LoRA series, making it a foundational structure comparable to LoRA. The previous version of Bone has been renamed to Bat; users can enable it by setting init_weights="bat".
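
For illustration, here is a minimal usage sketch assuming the BoneConfig API from this PR; the model name and target modules are placeholders, not taken from the PR:

from transformers import AutoModelForCausalLM
from peft import BoneConfig, get_peft_model

# Minimal sketch; the model name and target modules below are placeholders.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Default: the new, faster and more memory-efficient Bone variant.
bone_config = BoneConfig(target_modules=["q_proj", "v_proj"], r=64)

# The previous algorithm, now called Bat, is selected via init_weights="bat".
bat_config = BoneConfig(target_modules=["q_proj", "v_proj"], r=64, init_weights="bat")

peft_model = get_peft_model(model, bone_config)
peft_model.print_trainable_parameters()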

@BenjaminBossan (Member) left a comment

Thanks for the update, addressing some of the shortcomings of the initial Bone implementation. Just to be clear: What was previously called "Bone" now corresponds to "Bat" and the current "Bone" is something different (but similar). Am I right in my understanding that the new Bone is more memory efficient and faster than Bat but Bat performs better (tables 4 and 7)?

I added a few smaller comments to the PR. On top of that, let's ensure that both variants are covered by testing. For this, I think we need to add new rows with init_weights="bat" here:

("Vanilla MLP 1 Bone", "MLP", BoneConfig, {"target_modules": "lin0", "r": 2}),
("Vanilla MLP 2 Bone", "MLP", BoneConfig, {"target_modules": ["lin0"], "r": 2}),
("Vanilla MLP 3 Bone", "MLP", BoneConfig, {"target_modules": ["lin0", "lin1"], "r": 2}),
("Vanilla MLP 5 Bone", "MLP", BoneConfig, {"target_modules": ["lin0"], "modules_to_save": ["lin1"], "r": 2}),

Finally, WDYT about updating the Bone finetuning example to explicitly show results for Bone and Bat?

Resolved review threads: docs/source/conceptual_guides/adapter.md (3 threads), src/peft/tuners/bone/layer.py (1 thread)
@JL-er (Contributor, Author) commented on Nov 25, 2024

Thanks for the update, addressing some of the shortcomings of the initial Bone implementation. Just to be clear: What was previously called "Bone" now corresponds to "Bat" and the current "Bone" is something different (but similar). Am I right in my understanding that the new Bone is more memory efficient and faster than Bat but Bat performs better (tables 4 and 7)?

Yes, your understanding is correct. Bone is now a foundational method comparable to LoRA, while Bat is more of an improved method along the lines of PiSSA. However, Bone itself has already surpassed PiSSA.

@JL-er (Contributor, Author) commented on Nov 25, 2024

Finally, WDYT about updating the Bone finetuning example to explicitly show results for Bone and Bat?

I don't quite understand this issue. Currently, Bat is slower, so I think people would prefer to use Bone for training or research.
The other parts have been updated.

@BenjaminBossan (Member) commented:

I don't quite understand this issue. Currently, Bat is slower, so I think people would prefer to use Bone for training or research.

That's true, but Bat has slightly better scores, right? So some users might be willing to trade memory+speed for better scores. If you think it's not really worth it, at least the example could mention that this is something that users can try out?

@JL-er (Contributor, Author) commented on Nov 25, 2024

I don't quite understand this issue. Currently, Bat is slower, so I think people would prefer to use Bone for training or research.

That's true, but Bat has slightly better scores, right? So some users might be willing to trade memory+speed for better scores. If you think it's not really worth it, at least the example could mention that this is something that users can try out?

That's indeed the case, so where should I include this explanation?

@BenjaminBossan (Member) commented:

That's indeed the case, so where should I include this explanation?

How about adding a sentence or two here: https://github.com/huggingface/peft/tree/main/examples/bone_finetuning#advanced-usage. It could also be a good idea to add a flag to the script's args to enable Bat. WDYT?
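
For instance, such a flag could look roughly like this; this is purely illustrative, and the argument names and wiring are not taken from the PR's script:

import argparse
from peft import BoneConfig

# Hypothetical sketch of a script flag that switches between Bone and Bat.
parser = argparse.ArgumentParser()
parser.add_argument("--bone_r", type=int, default=64)
parser.add_argument("--use_bat", action="store_true",
                    help="Use the Bat variant (init_weights='bat') instead of plain Bone")
args = parser.parse_args()

config = BoneConfig(
    r=args.bone_r,
    target_modules=["q_proj", "v_proj"],  # placeholder target modules
    init_weights="bat" if args.use_bat else True,  # True selects the default Bone init
)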

@JL-er (Contributor, Author) commented on Nov 26, 2024

That's indeed the case, so where should I include this explanation?

How about adding a sentence or two here: https://github.com/huggingface/peft/tree/main/examples/bone_finetuning#advanced-usage. It could also be a good idea to add a flag to the script's args to enable Bat. WDYT?

Okay, I have added instructions to guide users on how to use Bat.

@BenjaminBossan (Member) commented:

Thanks for this update @JL-er. Could you please merge the latest main branch of PEFT, as it contains a fix required for CI.

@JL-er (Contributor, Author) commented on Nov 26, 2024

Thanks for this update @JL-er. Could you please merge the latest main branch of PEFT, as it contains a fix required for CI.

Completed.

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member) commented:

@JL-er Thanks for merging. Running the tests leads to a lot of errors related to Bone, such as

FAILED tests/test_custom_models.py::PeftCustomModelTester::test_disable_adapters_with_merging_087_Vanilla_MLP_1_Bone - RuntimeError: shape '[9, 10, 5, 2]' is invalid for input of size 90

Could you please check what happened there?

@JL-er (Contributor, Author) commented on Nov 26, 2024

@JL-er Thanks for merging. Running the tests leads to a lot of errors related to Bone, such as

FAILED tests/test_custom_models.py::PeftCustomModelTester::test_disable_adapters_with_merging_087_Vanilla_MLP_1_Bone - RuntimeError: shape '[9, 10, 5, 2]' is invalid for input of size 90

Could you please check what happened there?

All fixed.
(screenshot attached)

@BenjaminBossan (Member) left a comment

Thanks for this update to the Bone method. These improvements should hopefully help with the adoption of this method.

Note that technically, this is a backwards incompatible change, since the underlying algorithm for Bone was changed. This is fine as Bone is not part of any PEFT release. But after the next PEFT release (could be next week), this type of change will no longer be possible (we can still add new variations of Bone, but the default one needs to stay the same).

@BenjaminBossan merged commit 60978d7 into huggingface:main on Nov 27, 2024 (14 checks passed)
@JL-er (Contributor, Author) commented on Nov 27, 2024

Thanks for this update to the Bone method. These improvements should hopefully help with the adoption of this method.

Note that technically, this is a backwards incompatible change, since the underlying algorithm for Bone was changed. This is fine as Bone is not part of any PEFT release. But after the next PEFT release (could be next week), this type of change will no longer be possible (we can still add new variations of Bone, but the default one needs to stay the same).

Thank you for the reminder. Bone is now a foundational method that is faster and more memory-efficient than LoRA, so I will not make any further modifications to it.

@JL-er (Contributor, Author) commented on Dec 20, 2024

@BenjaminBossan My paper has undergone significant revisions. Can I update the citations and explanations?

@BenjaminBossan (Member) commented:

@JL-er Yes sure, go ahead. Did anything on the implementation side change? This would be more difficult to amend now.

@JL-er (Contributor, Author) commented on Dec 21, 2024

@JL-er Yes sure, go ahead. Did anything on the implementation side change? This would be more difficult to amend now.

The structure remains unchanged; only the paper has been revised to provide clearer explanations.
