Adding generic preference dataset builder #1623

SalmanMohammadi · 2024-09-19T16:38:22Z

Context

What is the purpose of this PR? Is it to

add a new feature
fix a bug
update tests and/or documentation
other (please add here)

I was writing the docs for the preference datasets but noticed we don't have a generic builder to point to, so, here we are.

Changelog

What are the changes made in this PR?

I've added a preference_dataset builder which uses ChosenRejectedToMessages. I don't think it's worth adding the Stack Exchange message format as an option at this time.
I've also exposed the HH RLHF dataset builder in our docs.
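
For anyone unfamiliar with the data layout, here is a minimal sketch of the kind of split the builder's message transform performs: one raw sample carrying two conversations becomes a chosen and a rejected message list. This is plain Python, not torchtune code. The helper `split_preference_sample` and the column names `chosen_conversations` / `rejected_conversations` are hypothetical, and the rejected conversation is assumed to share the chosen prompt.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def split_preference_sample(
    sample: Dict[str, List[Message]],
    column_map: Optional[Dict[str, str]] = None,
) -> Dict[str, List[Message]]:
    """Return the chosen and rejected conversations under fixed keys.

    Hypothetical helper for illustration only; it is not the transform
    used by torchtune.
    """
    # Default to columns literally named "chosen" and "rejected".
    column_map = column_map or {"chosen": "chosen", "rejected": "rejected"}
    # Copy each message so the raw sample is left untouched.
    chosen = [dict(m) for m in sample[column_map["chosen"]]]
    rejected = [dict(m) for m in sample[column_map["rejected"]]]
    return {"chosen": chosen, "rejected": rejected}


# Assumed sample layout; the message contents mirror the docstring example.
prompt = {"content": "What do I do when I have a hole in my trousers?", "role": "user"}
sample = {
    "chosen_conversations": [prompt, {"content": "Fix the hole.", "role": "assistant"}],
    "rejected_conversations": [prompt, {"content": "Take them off.", "role": "assistant"}],
}

out = split_preference_sample(
    sample,
    column_map={"chosen": "chosen_conversations", "rejected": "rejected_conversations"},
)
```

Both lists would then be tokenized separately, which is where fields such as `chosen_input_ids` and `rejected_input_ids` come from.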

Test plan

Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.

run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
add unit tests for any new functionality
update docstrings for any new or updated methods or classes
run unit tests via pytest tests
run recipe tests via pytest tests -m integration_test
manually run any new or modified recipes with sufficient proof of correctness
include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

UX

If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example
and a tutorial example

I did not change any public API
I have added an example to docs or docstrings

pytorch-bot · 2024-09-19T16:38:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1623

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit adb8934 with merge base c5db813:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

RdoubleA

do you want to also enable packing for preference datasets? should be easy to test and there's nothing blocking it

RdoubleA · 2024-09-19T16:39:55Z

tests/torchtune/datasets/test_preference_dataset.py

@@ -111,3 +112,46 @@ def test_get_item(self, mock_load_dataset, dialogue, expected):
        prompt, label = ds[0]["rejected_input_ids"], ds[0]["rejected_labels"]
        assert prompt == expected_rejected_tokens
        assert label == expected_rejected_labels
+
+    def test_load_local_json(self):


nice, we honestly need more of these local dataset tests

RdoubleA · 2024-09-19T16:42:34Z

torchtune/datasets/_preference.py

+                        "content": "What do I do when I have a hole in my trousers?",
+                        "role": "user"
+                    },
+                    { "content": "Take them off.", "role": "assistant" }


🫢 🫢 🫢

RdoubleA · 2024-09-19T16:43:14Z

torchtune/datasets/_preference.py

+    ...     split="train",
+    >>> )
+    >>> tokenizer.decode(dataset[0]["chosen_input_ids"], skip_special_tokens=True)
+    What do I do when I have a hole in my trousers?Fix the hole.


We should fix this trim whitespace issue at some point...

actually renders for me as 'user\n\nWhat do I do when I have a hole in my trousers?assistant\n\nTake them off.' with llama3 tokenizer and skip_special_tokens=True.

SalmanMohammadi · 2024-09-19T16:50:47Z

do you want to also enable packing for preference datasets? should be easy to test and there's nothing blocking it

I'm actually pretty out of the loop on packed datasets, sorry. TLDR how to?

RdoubleA · 2024-09-19T17:10:45Z

I'm actually pretty out of the loop on packed datasets, sorry. TLDR how to?

just emulate the logic here and that's all you need, then expose the packed parameter:

torchtune/torchtune/datasets/_instruct.py

Line 257 in c5db813

if packed:

then just run a DPO config with preference_dataset and use the override dataset.packed=True tokenizer.max_seq_len=4096 and see if it packs successfully and starts training with a few reasonable loss steps. You can hardcode num packs to make it pack faster

You could do this in a follow-up, but would be a nice boost to our RLHF recipes
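
A rough sketch of the pattern being suggested, assuming it amounts to "if packed, require a max sequence length and wrap the dataset". Everything here is a toy: `maybe_pack` and `greedy_pack` are hypothetical names, the packing is a simple greedy concatenation rather than torchtune's packed dataset, and it works on flat token lists, so it does not address how chosen/rejected pairs should be packed together.

```python
from typing import List, Optional, Sequence


def greedy_pack(samples: Sequence[Sequence[int]], max_seq_len: int) -> List[List[int]]:
    """Concatenate token lists into packs of at most max_seq_len tokens."""
    packs: List[List[int]] = []
    current: List[int] = []
    for tokens in samples:
        # Truncate any single sample that is longer than one pack.
        tokens = list(tokens)[:max_seq_len]
        # Start a new pack when the next sample would overflow this one.
        if current and len(current) + len(tokens) > max_seq_len:
            packs.append(current)
            current = []
        current = current + tokens
    if current:
        packs.append(current)
    return packs


def maybe_pack(samples, max_seq_len: Optional[int], packed: bool = False):
    """Gate packing behind a flag, mirroring an `if packed:` branch in a builder."""
    if not packed:
        return samples
    if max_seq_len is None:
        # Packing needs a fixed pack length to fill.
        raise ValueError("packed=True requires max_seq_len to be set")
    return greedy_pack(samples, max_seq_len)


samples = [[1, 2], [3], [4, 5, 6]]
packs = maybe_pack(samples, max_seq_len=4, packed=True)
```

With the flag off the input is returned unchanged, so existing configs would keep their behaviour unless `packed=True` is passed explicitly.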

SalmanMohammadi · 2024-09-19T17:13:21Z

Looks suspiciously straightforward - will follow up with this.

adding generic pref builder

2687238

SalmanMohammadi requested a review from RdoubleA September 19, 2024 16:38

facebook-github-bot added the CLA Signed label Sep 19, 2024

RdoubleA approved these changes Sep 19, 2024

udpating example

a2ba6a5

being bullied by sphinx

adb8934

SalmanMohammadi merged commit cd573f9 into pytorch:main Sep 19, 2024
17 checks passed

SalmanMohammadi deleted the preference_dataset_builder branch September 19, 2024 17:35
