
Add the possibility to skip prepare_model_for_kbit_training #2459

Open
hugoabonizio opened this issue Dec 10, 2024 · 1 comment
Labels
🏋 DPO Related to DPO ✨ enhancement New feature or request

Comments

@hugoabonizio

Feature request

I'd like to be able to skip the prepare_model_for_kbit_training call that occurs inside DPOTrainer.

Motivation

I noticed a significant speedup from skipping the prepare_model_for_kbit_training call in DPOTrainer, without degrading training quality, provided the gradient checkpointing configuration is set properly, as mentioned in these comments.
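For context, a minimal sketch of what "setting the gradient checkpointing configurations" can look like through the standard training arguments, assuming the non-reentrant checkpointing variant (which generally plays well with PEFT adapters) is what the linked comments refer to:

```python
from trl import DPOConfig

# Sketch only: enable gradient checkpointing via the training arguments
# instead of relying on prepare_model_for_kbit_training to patch the model.
training_args = DPOConfig(
    output_dir="dpo-output",
    gradient_checkpointing=True,
    # The non-reentrant implementation avoids issues with frozen inputs
    # that the reentrant variant can hit when training adapters.
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```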

Your contribution

I can open a PR that adds a flag to allow users to skip this call when desired.
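As a rough illustration of the proposed change, the call site could be gated on an opt-out flag. `skip_prepare_model` and `maybe_prepare_model` are hypothetical names for this sketch, not existing TRL arguments:

```python
# Hypothetical sketch of the proposed flag. `prepare_fn` stands in for
# peft.prepare_model_for_kbit_training; `skip_prepare_model` is the
# proposed opt-out and defaults to the current behaviour.
def maybe_prepare_model(model, is_quantized, prepare_fn, skip_prepare_model=False):
    # Today, quantized models are always prepared; the flag lets the
    # user opt out when they configure gradient checkpointing themselves.
    if is_quantized and not skip_prepare_model:
        return prepare_fn(model)
    return model
```

With the default `skip_prepare_model=False`, existing behaviour is unchanged, so the flag is backward compatible.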

@qgallouedec
Member

Thanks for this suggestion.
Can you quantify the speedup?
Any idea how to properly set the gradient checkpointing configurations?

Can we reproduce the speedup with a very simple code example?

@qgallouedec qgallouedec added ✨ enhancement New feature or request 🏋 DPO Related to DPO labels Dec 13, 2024