-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new version Bone #2233
new version Bone #2233
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, addressing some of the shortcomings of the initial Bone implementation. Just to be clear: What was previously called "Bone" now corresponds to "Bat" and the current "Bone" is something different (but similar). Am I right in my understanding that the new Bone is more memory efficient and faster than Bat but Bat performs better (tables 4 and 7)?
I added a few smaller comments to the PR. On top of that, let's ensure that both variants are covered by testing. For this, I think we need to add new rows with init_weights="bat"
here:
peft/tests/test_custom_models.py
Lines 301 to 304 in eaaf03c
("Vanilla MLP 1 Bone", "MLP", BoneConfig, {"target_modules": "lin0", "r": 2}), | |
("Vanilla MLP 2 Bone", "MLP", BoneConfig, {"target_modules": ["lin0"], "r": 2}), | |
("Vanilla MLP 3 Bone", "MLP", BoneConfig, {"target_modules": ["lin0", "lin1"], "r": 2}), | |
("Vanilla MLP 5 Bone", "MLP", BoneConfig, {"target_modules": ["lin0"], "modules_to_save": ["lin1"], "r": 2}), |
Finally, WDYT about updating the Bone finetuning example to explicitly show results for Bone and Bat?
Yes, your understanding is correct. Bone is now more like a foundational method comparable to Lora. Bat is more like an improved method compared to Pissa. However, Bone itself has already surpassed Pissa. |
I don't quite understand this issue. Currently, Bat is slower, so I think people would prefer to use Bone for training or research. |
That's true, but Bat has slightly better scores, right? So some users might be willing to trade memory+speed for better scores. If you think it's not really worth it, at least the example could mention that this is something that users can try out? |
That's indeed the case, so where should I include this explanation? |
How about adding a sentence or two here: https://github.com/huggingface/peft/tree/main/examples/bone_finetuning#advanced-usage. It could also be a good idea to add a flag the args of the script to enable Bat. WDYT? |
Okay, I have added instructions to guide users on how to use Bat. |
Thanks for this update @JL-er. Could you please merge the latest |
Completed. |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@JL-er Thanks for merging. Running the tests leads to a lot of errors related to Bone, such as
Could you please check what happened there? |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this update to the Bone method. These improvements should hopefully help with the adoption of this method.
Note that technically, this is a backwards incompatible change, since the underlying algorithm for Bone was changed. This is fine as Bone is not part of any PEFT release. But after the next PEFT release (could be next week), this type of change will no longer be possible (we can still add new variations of Bone, but the default one needs to stay the same).
Thank you for your reminder. Currently, Bone is a very basic method that is better than LoRA in terms of speed and memory usage, so I will not make any modifications to it. |
@BenjaminBossan My paper has undergone significant revisions. Can I update the citations and explanations? |
@JL-er Yes sure, go ahead. Did anything on the implementation side change? This would be more difficult to amend now. |
The structure remains unchanged, only the paper section has been modified to provide more reasonable explanations. |
https://arxiv.org/abs/2409.15371
Bone has been updated to a new version. The current Bone is faster, more memory-efficient, and performs better than the Lora series, making it a foundational structure comparable to Lora. The previous version of Bone has now been renamed to Bat, and users can use it by including "bat" in init_weight.