support wandb log in dpo #568

akk-123 · 2023-07-25T08:00:52Z

1. DPO currently logs metrics only through self.log_metrics. Please support friendlier wandb logging, as in PPOTrainer.
2. DPO needs to load two models. Could it support a LoRA model, so that only one model has to be loaded?

kashif · 2023-07-25T15:31:58Z

Regarding point 2, @akk-123: the reference model is only used in evaluation mode, and the main model can certainly be prepared for training with PEFT. That should work the same way it does in the other trainers that use PEFT.

akk-123 · 2023-07-26T02:38:30Z

@kashif Thanks. I am using LoRA to train the main model and set save_steps to save the weights, but I found that far too many files are saved in each checkpoint.

How can I save only adapter_model.bin and adapter_config.json?
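
A minimal sketch of one possible workaround, not something confirmed in this thread: post-process each checkpoint directory after it is written and delete everything except the two adapter files. `prune_checkpoint` and `ADAPTER_FILES` are hypothetical names for illustration and are not part of TRL, PEFT, or transformers; the extra file names in the demo are placeholders for whatever else ends up in the checkpoint.

```python
import os
import tempfile
from pathlib import Path

# The two files named in the question; everything else is removed.
# (Hypothetical constant, not a library setting.)
ADAPTER_FILES = frozenset({"adapter_model.bin", "adapter_config.json"})


def prune_checkpoint(checkpoint_dir, keep=ADAPTER_FILES):
    """Delete every file under checkpoint_dir whose name is not in `keep`.

    Returns the sorted POSIX-style relative paths of the removed files.
    """
    root = Path(checkpoint_dir)
    removed = []
    # Walk bottom-up so subdirectories emptied by the pruning can be removed too.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        current = Path(dirpath)
        for name in filenames:
            if name not in keep:
                path = current / name
                path.unlink()
                removed.append(path.relative_to(root).as_posix())
        if current != root and not any(current.iterdir()):
            current.rmdir()
    return sorted(removed)


if __name__ == "__main__":
    # Demo on a throwaway directory with placeholder files.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("adapter_model.bin", "adapter_config.json", "optimizer.pt"):
            (Path(tmp) / name).touch()
        print("removed:", prune_checkpoint(tmp))
        print("kept:", sorted(os.listdir(tmp)))
```

Note that a checkpoint pruned this way can no longer be used to resume training, because the optimizer and scheduler state are deleted along with the other files; it only keeps what is needed to reload the adapter.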

This was referenced Jul 25, 2023

DPOTrainer logging too frequent #569

Closed

[DPO] Resolve logging for DPOTrainer #570

Merged

lvwerra closed this as completed in #570 Jul 26, 2023
