Standardise example scripts #842

lewtun · 2023-10-07T13:35:40Z

This PR standardises all the example scripts to follow the run_xxx.py convention, where xxx typically refers to the algorithm rather than the task (e.g. a single PPO example instead of a script named after "sentiment tuning"). The resulting structure is as follows:

examples/scripts
├── run_ddpo.py
├── run_dpo.py
├── run_ppo.py
├── run_ppo_multi_adapter.py
├── run_reward_modeling.py
└── run_sft.py

IMO this makes it a bit easier for newcomers to know what each script does by filename instead of guessing whether e.g. multi adapter RL refers to PPO or something else.

I also deleted multi_adapter_rl.py, an old duplicate of the multi-adapter RL script that seems to be outdated.

Eventually, we could harmonize the scripts so that the SFT and reward models produced by run_sft.py and run_reward_modeling.py are the same ones that feed into run_ppo.py and run_dpo.py. This would give a true end-to-end pipeline that is maintained and solid enough for many people to build on :)
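To make the proposed pipeline concrete, here is a minimal sketch of the ordering it implies. The dependency map is an assumption for illustration only: it supposes PPO consumes both the SFT model and the reward model while DPO consumes only the SFT model, and it does not reflect any flags or outputs the actual scripts define.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (not taken from the repository):
# each script maps to the scripts whose outputs it would consume.
PIPELINE = {
    "run_sft.py": set(),
    "run_reward_modeling.py": set(),
    "run_ppo.py": {"run_sft.py", "run_reward_modeling.py"},
    "run_dpo.py": {"run_sft.py"},
}


def execution_order(pipeline):
    """Return one valid order in which every script runs after its inputs."""
    return list(TopologicalSorter(pipeline).static_order())


order = execution_order(PIPELINE)
print(order)
```

Any order returned places run_sft.py before both run_ppo.py and run_dpo.py, which is the property the harmonized scripts would need to respect.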

HuggingFaceDocBuilderDev · 2023-10-07T13:41:00Z

The documentation is not available anymore as the PR was closed or merged.

vwxyzjn · 2023-10-09T12:56:24Z

/benchmark-trl-experiments benchmark/benchmark_level1.sh

github-actions · 2023-10-09T12:59:00Z

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6456929199

vwxyzjn

Love the standardization! Very nice change. I assume multi_adapter_rl.py is deprecated in favor of multi_adapter_rl_v2.py (the now run_ppo_multi_adapter.py)?

lewtun · 2023-10-09T13:35:52Z

> Love the standardization! Very nice change. I assume multi_adapter_rl.py is deprecated in favor of multi_adapter_rl_v2.py (the now run_ppo_multi_adapter.py)?

Yes, that's correct!

vwxyzjn · 2023-10-09T14:43:38Z

/benchmark-trl-experiments benchmark/benchmark_level1.sh

github-actions · 2023-10-09T14:44:57Z

Benchmark on Comment: failed ❌
https://github.com/huggingface/trl/actions/runs/6458185139

vwxyzjn · 2023-10-09T14:46:15Z

/benchmark-trl-experiments benchmark/benchmark_level1.sh

github-actions · 2023-10-09T14:47:02Z

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6458212548

vwxyzjn · 2023-10-09T15:37:17Z

[COSTA BENCHMARK BOT]: Here are the results

lvwerra

Generally looks great, thanks! Small nit: I don't like the run_xxx.py naming that much, I think just xxx.py would do the job and be less redundant.

lewtun · 2023-10-11T13:54:06Z

> Generally looks great, thanks! Small nit: I don't like the run_xxx.py naming that much, I think just xxx.py would do the job and be less redundant.

Good idea! Done in a6d1d90

I'll merge if all the tests still pass
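For readers following along, the rename from run_xxx.py to xxx.py can be sketched as a simple prefix strip over the script list from the PR description. This is an illustration of the naming change only; the helper name `strip_run_prefix` is hypothetical, and the snippet does not reproduce the contents of commit a6d1d90.

```python
from pathlib import PurePosixPath

# Script paths as listed in the PR description.
SCRIPTS = [
    "examples/scripts/run_ddpo.py",
    "examples/scripts/run_dpo.py",
    "examples/scripts/run_ppo.py",
    "examples/scripts/run_ppo_multi_adapter.py",
    "examples/scripts/run_reward_modeling.py",
    "examples/scripts/run_sft.py",
]


def strip_run_prefix(path: str) -> str:
    """Drop a leading 'run_' from the filename, leaving the directory untouched."""
    p = PurePosixPath(path)
    name = p.name
    if name.startswith("run_"):
        name = name[len("run_"):]
    return str(p.with_name(name))


# Map each old path to its new path.
renames = {old: strip_run_prefix(old) for old in SCRIPTS}
for old, new in renames.items():
    print(f"{old} -> {new}")
```

Paths that do not start with the prefix are returned unchanged, so applying the helper twice is harmless.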

docs/source/sentiment_tuning.mdx

vwxyzjn · 2023-10-11T15:11:35Z

LG!

* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* enable xpu support
* fix bug
* review commits
* fix style
* add xou decorator
* refactor review commit
* fix test
* review commit
* fix test
* Update benchmark.yml (#856)
* Standardise example scripts (#842)
* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* Fix version check in import_utils.py (#853)
* dont use get_peft_model if model is already peft (#857)
* merge conflict
* add xou decorator
* resolve
* resolves
* upstream
* refactor and precommit
* fix new tests
* add device mapping for xpu

---------

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Costa Huang <costa.huang@outlook.com>
Co-authored-by: Adam Pauls <adpauls@gmail.com>
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>


Standardise example scripts

17f6266

lewtun requested review from vwxyzjn, lvwerra and younesbelkada October 9, 2023 13:04

vwxyzjn reviewed Oct 9, 2023


Merge branch 'main' into order-scripts

f9fb3be

vwxyzjn approved these changes Oct 9, 2023


fix plotting script

b705d7d

lvwerra approved these changes Oct 10, 2023


Rename run_xxx to xxx

a6d1d90

lewtun commented Oct 11, 2023


docs/source/sentiment_tuning.mdx (outdated, resolved)

Fix doc

1083a9f

lewtun merged commit ddd3188 into main Oct 11, 2023

lewtun deleted the order-scripts branch October 11, 2023 15:28

neo mentioned this pull request Oct 11, 2023

Update example script link for ddpo huggingface/blog#1573

Merged

pcuenca pushed a commit to huggingface/blog that referenced this pull request Oct 20, 2023

Update example script link for ddpo (huggingface/trl#842) (#1573)

18b4484

lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024

Standardise example scripts (huggingface#842)

ffd35ff

