Update custom eval loop to aid DPO debugging #770

tomaarsen · 2023-09-14T10:04:11Z

Hello!

This is intended to be pushed directly on top of dpo_custom_eval, i.e. on top of #766, but I don't have the permissions for that.

Pull Request overview

sample_during_eval is now generate_during_eval - I think sample is a bit too vague.
return_tokens was unused, so I removed it.
Prevent test failures caused by importing wandb, without making wandb a mandatory dependency. I added import utils for W&B and a test.
Optimize random batch selection.
Separate prompt and Policy/Reference responses in game log table.
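As a rough illustration of the optional W&B import mentioned in the overview, here is a minimal sketch of an availability check. The helper names `is_package_available` and `require_wandb` are hypothetical and are not taken from this PR; the actual import utils may look different.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if the top-level package `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def require_wandb():
    """Import wandb lazily, so it is only required when logging is actually used.

    Hypothetical helper: raises a clear ImportError if wandb is not installed.
    """
    if not is_package_available("wandb"):
        raise ImportError("Logging generations requires wandb: `pip install wandb`.")
    import wandb

    return wandb
```

With a check like this, modules can be imported and tested without wandb installed, and the error only surfaces when a code path actually needs W&B.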

This PR is a WIP.

Tom Aarsen

HuggingFaceDocBuilderDev · 2023-09-14T10:09:07Z

The documentation is not available anymore as the PR was closed or merged.

tomaarsen · 2023-09-14T10:11:47Z

Bad news @natolambert, the islice still seems to iterate over all elements until random_index.

tomaarsen · 2023-09-14T10:32:25Z

That's easy to resolve though. The new approach takes ~0.005 seconds regardless of which index is used.

This also doesn't restrict us to batches anymore. We can just go for 1 sample now, for example.
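To make the difference concrete, here is a small sketch comparing the two ways of picking an item. It assumes a map-style dataset that supports `len()` and integer indexing; `CountingDataset` and the `pick_*` helpers are toy names invented for this example, not code from the PR.

```python
import itertools
import random


class CountingDataset:
    """Toy map-style dataset that counts how many items get loaded."""

    def __init__(self, size):
        self.size = size
        self.loads = 0

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        self.loads += 1  # stands in for tokenization / collation cost
        return {"prompt": f"prompt {index}"}

    def __iter__(self):
        return (self[i] for i in range(self.size))


def pick_with_islice(dataset, index):
    # islice cannot jump ahead: it consumes every item before `index`.
    return next(itertools.islice(iter(dataset), index, None))


def pick_directly(dataset, index):
    # Indexing a map-style dataset loads only the requested item.
    return dataset[index]


def pick_random_samples(dataset, k=1, seed=None):
    # Draw k distinct indices, then load just those k items.
    rng = random.Random(seed)
    return [dataset[i] for i in rng.sample(range(len(dataset)), k)]
```

In this toy setup, `pick_with_islice(ds, 500)` loads 501 items while `pick_directly(ds, 500)` loads one, which is why direct indexing takes the same time for any index and works for a single sample as well as a batch.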

Makes it much easier to quickly read the starts of the generations
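A minimal sketch of how the prompt could be separated from the generations before logging, assuming the decoded outputs begin with the prompt text. The function names and the column labels are illustrative assumptions, not the PR's actual code.

```python
# Assumed column labels for the log table (illustrative only).
COLUMNS = ["Prompt", "Policy", "Reference"]


def strip_prompt(prompt, text):
    """Drop the leading prompt from a decoded generation, if it is present."""
    return text[len(prompt):] if text.startswith(prompt) else text


def build_log_rows(prompts, policy_outputs, reference_outputs):
    """Build one row per sample: the prompt, then each model's response on its own."""
    return [
        [prompt, strip_prompt(prompt, policy), strip_prompt(prompt, reference)]
        for prompt, policy, reference in zip(prompts, policy_outputs, reference_outputs)
    ]
```

Each row then starts with the prompt once, and the Policy and Reference columns begin directly at the generated text.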

* init
* run
* Update custom eval loop to aid DPO debugging (#770)
* sample_during_eval -> generate_during_eval
* Remove unused return_tokens
* Add import utils for W&B, prevent test fails
* Optimize dataloader random batch selection
* Separate prompt and response in logs
  Makes it much easier to quickly read the starts of the generations
* Simplify logging
* reset eval steps
* manual merge fixes
* revert merge
* remove self.max_length
* style
* fix max_length

---------

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>


tomaarsen added 3 commits September 14, 2023 10:51

sample_during_eval -> generate_during_eval

e0e4a87

Remove unused return_tokens

891422b

Add import utils for W&B, prevent test fails

3b57daf

Optimize dataloader random batch selection

36acf85

tomaarsen added 2 commits September 14, 2023 13:43

Separate prompt and response in logs

63059ee

Makes it much easier to quickly read the starts of the generations

Simplify logging

40e6765

tomaarsen mentioned this pull request Sep 14, 2023

[FEATURE] Implement ArgillaTRLCallback to log generations while training TRL models into Argilla argilla-io/argilla#3770

Closed

natolambert merged commit d53b982 into huggingface:dpo_custom_eval Sep 14, 2023

tomaarsen deleted the dpo_custom_eval branch September 14, 2023 15:02
