Feature request
I'd like to be able to skip the prepare_model_for_kbit_training call that occurs inside DPOTrainer.
Motivation
I noticed a significant speedup when skipping the prepare_model_for_kbit_training call in DPOTrainer, with no degradation in training quality, provided the gradient checkpointing configuration is set properly, as mentioned in this comment and this comment.
Your contribution
I can open a PR that adds a flag to allow users to skip this call when desired.
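To make the proposal concrete, here is a minimal sketch of how such an opt-out could look. The flag name skip_prepare_for_kbit_training, the SketchConfig class, and the maybe_prepare_model helper are all hypothetical and are not existing TRL arguments. The preparation function is passed in as a parameter so the sketch stays self-contained; in practice it would presumably be peft.prepare_model_for_kbit_training.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SketchConfig:
    # Hypothetical flag for illustration only; not an existing TRL option.
    skip_prepare_for_kbit_training: bool = False
    gradient_checkpointing: bool = True


def maybe_prepare_model(
    model: Any,
    config: SketchConfig,
    prepare_fn: Callable[..., Any],
) -> Any:
    """Run the k-bit preparation step unless the user opted out.

    prepare_fn stands in for peft.prepare_model_for_kbit_training,
    which accepts a use_gradient_checkpointing keyword argument.
    """
    if config.skip_prepare_for_kbit_training:
        # The user is then responsible for configuring gradient
        # checkpointing on the model themselves.
        return model
    return prepare_fn(
        model,
        use_gradient_checkpointing=config.gradient_checkpointing,
    )
```

With the flag left at its default of False, behavior would stay as it is today, so the change would be opt-in.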