Conversational dataset support for `KTOTrainer` #2248

qgallouedec · 2024-10-18T14:18:22Z

What does this PR do?

Part of #2071
Fixes #2238

This PR also adds support for paired preference dataset for KTO.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2024-10-18T14:23:02Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

lewtun

Great refactor @qgallouedec ! LGTM with a tiny nit

docs/source/kto_trainer.mdx

lewtun · 2024-10-24T09:36:48Z

examples/scripts/kto.py

@@ -95,24 +93,6 @@
    # Load the dataset
    dataset = load_dataset(script_args.dataset_name)

-    # If needed, reformat a DPO-formatted dataset (prompt, chosen, rejected) to a KTO-format (prompt, completion, label)


Nice refactor!

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

qgallouedec and others added 7 commits October 18, 2024 09:01

get_batch_sample -> generate_from_model[_and_ref]

979c5c5

add num_items_in_batch=None

ada53cf

num_items_in_batch in training_step

10bffa0

Fix return type hint

ca2d98f

desc for unpair dataset util

0ecc5fb

update example

2f60dd0

process in KTO

c53d0fc

This was referenced Oct 18, 2024

[Tracking issue] General dataset support #2071

Open

examples/scripts/kto.py does not work #2238

Closed

qgallouedec added 4 commits October 18, 2024 14:29

Update doc

27f483b

KTO doc rewrite

7a63418

fix orpo doc

b9f9ce2

add other dataset config names in test

3bfcd4b

qgallouedec marked this pull request as ready for review October 18, 2024 16:28

qgallouedec added 3 commits October 18, 2024 16:34

update doc image

9e977af

fix links in doc

97702d3

Update reward and log probability metrics in KTOTrainer doc

531441a

qgallouedec changed the base branch from main to rename_get_batch_sample October 18, 2024 16:43

skip enc-dec test

50e8e97

Base automatically changed from rename_get_batch_sample to main October 18, 2024 19:02

Merge branch 'main' into kto-conv-data-support

c829b07

qgallouedec requested review from kashif, edbeeching and lewtun October 21, 2024 07:26

Merge branch 'main' into kto-conv-data-support

249c3bb

lewtun approved these changes Oct 24, 2024

View reviewed changes

qgallouedec and others added 2 commits October 24, 2024 11:39

Update docs/source/kto_trainer.mdx

58f64b8

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Merge branch 'main' into kto-conv-data-support

aa960f8

qgallouedec merged commit 1699473 into main Oct 24, 2024
3 of 10 checks passed

qgallouedec deleted the kto-conv-data-support branch October 24, 2024 12:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Conversational dataset support for `KTOTrainer` #2248

Conversational dataset support for `KTOTrainer` #2248

qgallouedec commented Oct 18, 2024 •

edited

Loading

HuggingFaceDocBuilderDev commented Oct 18, 2024

lewtun left a comment

lewtun Oct 24, 2024

Conversational dataset support for KTOTrainer #2248

Conversational dataset support for KTOTrainer #2248

Conversation

qgallouedec commented Oct 18, 2024 • edited Loading

What does this PR do?

Before submitting

Who can review?

HuggingFaceDocBuilderDev commented Oct 18, 2024

lewtun left a comment

Choose a reason for hiding this comment

lewtun Oct 24, 2024

Choose a reason for hiding this comment

Conversational dataset support for `KTOTrainer` #2248

Conversational dataset support for `KTOTrainer` #2248

qgallouedec commented Oct 18, 2024 •

edited

Loading