Synthetic preference dataset #332

vwxyzjn · 2024-09-05T20:35:40Z

It's useful to reuse our existing rejection sampling infra and build on top of it to create a synthetic preference dataset.

# 1. first sample a bunch of completions given prompts
# Here is an example created dataset: https://huggingface.co/datasets/vwxyzjn/generation_1725567768
python open_instruct/rejection_sampling/generation.py \
    --dataset_name HuggingFaceH4/no_robots \
    --model_name_or_path allenai/llama-3-tulu-2-8b \
    --num_completions 3 \
    --save_filename output/completions.jsonl \
    --sanity_check \
    --push_to_hub

Create preference pairs

# 2.1 use an LLM as a judge to create a synthetic preference dataset
# Here is an example created dataset: https://huggingface.co/datasets/vwxyzjn/synthetic_preference_dataset_1725567862
python open_instruct/rejection_sampling/synthetic_preference_dataset.py \
    --input_filename output/completions.jsonl \
    --model gpt-4o-2024-08-06 \
    --save_filename output/synthetic_preferences.jsonl \
    --num_completions 3 \
    --push_to_hub

Multiple prompt pairs as well.
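
For readers who want a picture of what step 2.1 produces, here is a rough sketch of turning judged completions into chosen/rejected pairs. It is not the code in `synthetic_preference_dataset.py`: the function name `build_preference_pairs`, the numeric `scores` input, and the `prompt`/`chosen`/`rejected` keys are all assumptions made for illustration. It assumes the judge has already assigned one score per completion.

```python
from itertools import combinations


def build_preference_pairs(prompt, completions, scores):
    """Hypothetical helper: pair up completions for a single prompt.

    For every pair of completions, the one with the higher judge score
    becomes "chosen" and the other "rejected". Ties are skipped because
    they carry no preference signal.
    """
    if len(completions) != len(scores):
        raise ValueError("need exactly one score per completion")

    pairs = []
    for i, j in combinations(range(len(completions)), 2):
        if scores[i] == scores[j]:
            continue  # no preference can be derived from a tie
        better, worse = (i, j) if scores[i] > scores[j] else (j, i)
        pairs.append(
            {
                "prompt": prompt,  # assumed output schema, not the script's
                "chosen": completions[better],
                "rejected": completions[worse],
            }
        )
    return pairs
```

With `--num_completions 3`, a single prompt can yield up to three pairs under this scheme, fewer if the judge scores tie.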

ValentinaPy · 2024-09-05T20:37:29Z

Very cool! I can run it over the WildChat prompts!


nouhadziri · 2024-09-06T14:01:07Z

This is pretty cool Costa! thanks for adding this. LGTM

push changes

8934de7

vwxyzjn requested review from ValentinaPy, jacob-morrison, natolambert and nouhadziri September 5, 2024 20:35

vwxyzjn marked this pull request as draft September 5, 2024 20:35

vwxyzjn requested a review from ljvmiranda921 September 5, 2024 20:39

vwxyzjn changed the title ~~push changes~~ Synthetic preference dataset Sep 5, 2024

vwxyzjn added 3 commits September 5, 2024 20:41

Support hf revision when generating

5a200b6

style an quality

6b6d8fe

Merge branch 'support-revision-when-generate' into synthetic-preferen…

63d360b

…ce-dataset

vwxyzjn mentioned this pull request Sep 11, 2024

Add online trainers #204

Merged
