Turns out this was a mistake on my side, sorry for the ping! The implementation is actually correct (we were just dropping the logp columns). Closing this out.
This line was not properly changed when #885 added the `precompute_train_ref_log_probs` option: https://github.com/huggingface/trl/blob/8f5b4923c8caca1f352581eb6f2fda583517b1a6/trl/trainer/dpo_trainer.py#L935C41-L935C41

If `precompute_train_ref_log_probs=True`, then `ref_model=None`. If `ref_model=None`, then the model is assumed to be PEFT (even if it is not) on this line. Therefore, the `.disable_adapter()` call fails, because the model is not a PEFT model.

Tagging: @kashif @lvwerra @younesbelkada
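For readers trying to reproduce this, here is a minimal sketch of the failure mode described above. It is a toy illustration, not TRL's code: the classes, the `reference_log_prob` helper, and the `reference_logps` column name are all hypothetical. It assumes that precomputed reference log probabilities are used when they are present in the batch, and that the `.disable_adapter()` fallback is only reached when they are missing.

```python
from contextlib import contextmanager


class PlainModel:
    """Hypothetical stand-in for a non-PEFT model: it has no disable_adapter()."""

    def log_prob(self, example):
        return -1.0


class PeftLikeModel(PlainModel):
    """Hypothetical stand-in for a PEFT-wrapped model."""

    @contextmanager
    def disable_adapter(self):
        # A real PEFT model would switch its adapters off inside this block.
        yield


def reference_log_prob(model, ref_model, batch):
    # Assumption: precomputed values, when present in the batch, are used directly.
    if "reference_logps" in batch:
        return batch["reference_logps"]
    if ref_model is not None:
        return ref_model.log_prob(batch["example"])
    # No reference model and no precomputed column: the model is assumed to be PEFT.
    with model.disable_adapter():
        return model.log_prob(batch["example"])


# With the precomputed column present, a plain model works without a ref_model.
print(reference_log_prob(PlainModel(), None, {"example": "x", "reference_logps": -0.5}))

# With the column dropped, the PEFT fallback is reached and a plain model fails.
try:
    reference_log_prob(PlainModel(), None, {"example": "x"})
except AttributeError as err:
    print("fallback failed:", err)
```

Under these assumptions, the error appears only when the precomputed columns are missing from the batch, which matches the "dropping the logp columns" explanation given when the issue was closed.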