Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardise example scripts #842

Merged
merged 5 commits into from
Oct 11, 2023
Merged

Standardise example scripts #842

merged 5 commits into from
Oct 11, 2023

Conversation

lewtun
Copy link
Member

@lewtun lewtun commented Oct 7, 2023

This PR standardises all the example scripts to follow the run_xxx.py convention, where xxx typically refers to the algorithm instead of the task (i.e. have just 1 PPO example instead of calling it "sentiment tuning"). The resulting structure is as follows:

examples/scripts
├── run_ddpo.py
├── run_dpo.py
├── run_ppo.py
├── run_ppo_multi_adapter.py
├── run_reward_modeling.py
└── run_sft.py

IMO this makes it a bit easier for newcomers to know what each script does by filename instead of guessing whether e.g. multi adapter RL refers to PPO or something else.

I also deleted an old and duplicate multi adapter RL script multi_adapter_rl.py which seems to be outdated.

Eventually, we could harmonize the scripts so that the SFT and reward models produced by run_sft.py and run_reward_modeling.py are the same ones that feed into run_ppo.py and run_dpo.py. This would give a true end to end pipeline that is maintained & solid for many people to work from :)

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Oct 7, 2023

The documentation is not available anymore as the PR was closed or merged.

@vwxyzjn
Copy link
Contributor

vwxyzjn commented Oct 9, 2023

/benchmark-trl-experiments benchmark/benchmark_level1.sh

@github-actions
Copy link

github-actions bot commented Oct 9, 2023

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6456929199

Copy link
Contributor

@vwxyzjn vwxyzjn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the standardization! Very nice change. I assume multi_adapter_rl.py is deprecated in favor of multi_adapter_rl_v2.py (the now run_ppo_multi_adapter.py)?

@lewtun
Copy link
Member Author

lewtun commented Oct 9, 2023

Love the standardization! Very nice change. I assume multi_adapter_rl.py is deprecated in favor of multi_adapter_rl_v2.py (the now run_ppo_multi_adapter.py)?

Yes, that's correct!

@vwxyzjn
Copy link
Contributor

vwxyzjn commented Oct 9, 2023

/benchmark-trl-experiments benchmark/benchmark_level1.sh

@github-actions
Copy link

github-actions bot commented Oct 9, 2023

Benchmark on Comment: failed ❌
https://github.com/huggingface/trl/actions/runs/6458185139

@vwxyzjn
Copy link
Contributor

vwxyzjn commented Oct 9, 2023

/benchmark-trl-experiments benchmark/benchmark_level1.sh

@github-actions
Copy link

github-actions bot commented Oct 9, 2023

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6458212548

@vwxyzjn
Copy link
Contributor

vwxyzjn commented Oct 9, 2023

[COSTA BENCHMARK BOT]: Here are the results
hello_world.png
hello_world-time.png

Copy link
Member

@lvwerra lvwerra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks great, thanks! Small nit: I don't like the run_xxx.py naming that much, I think just xxx.py would do the job and be less redundant.

@lewtun
Copy link
Member Author

lewtun commented Oct 11, 2023

Generally looks great, thanks! Small nit: I don't like the run_xxx.py naming that much, I think just xxx.py would do the job and be less redundant.

Good idea! Done in a6d1d90

I'll merge if all the tests still pass

@vwxyzjn
Copy link
Contributor

vwxyzjn commented Oct 11, 2023

LG!

@lewtun lewtun merged commit ddd3188 into main Oct 11, 2023
@lewtun lewtun deleted the order-scripts branch October 11, 2023 15:28
abhilash1910 pushed a commit to abhilash1910/trl that referenced this pull request Oct 12, 2023
* Standardise example scripts

* fix plotting script

* Rename run_xxx to xxx

* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>
pcuenca pushed a commit to huggingface/blog that referenced this pull request Oct 20, 2023
lvwerra added a commit that referenced this pull request Oct 31, 2023
* enable xpu support

* fix bug

* review commits

* fix style

* add xou decorator

* refactor review commit

* fix test

* review commit

* fix test

* Update benchmark.yml (#856)

* Standardise example scripts (#842)

* Standardise example scripts

* fix plotting script

* Rename run_xxx to xxx

* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* Fix version check in import_utils.py (#853)

* dont use get_peft_model if model is already peft (#857)

* merge conflict

* add xou decorator

* resolve

* resolves

* upstream

* refactor and precommit

* fix new tests

* add device mapping for xpu

---------

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Costa Huang <costa.huang@outlook.com>
Co-authored-by: Adam Pauls <adpauls@gmail.com>
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>
lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024
* Standardise example scripts

* fix plotting script

* Rename run_xxx to xxx

* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>
lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024
* enable xpu support

* fix bug

* review commits

* fix style

* add xou decorator

* refactor review commit

* fix test

* review commit

* fix test

* Update benchmark.yml (huggingface#856)

* Standardise example scripts (huggingface#842)

* Standardise example scripts

* fix plotting script

* Rename run_xxx to xxx

* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* Fix version check in import_utils.py (huggingface#853)

* dont use get_peft_model if model is already peft (huggingface#857)

* merge conflict

* add xou decorator

* resolve

* resolves

* upstream

* refactor and precommit

* fix new tests

* add device mapping for xpu

---------

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Costa Huang <costa.huang@outlook.com>
Co-authored-by: Adam Pauls <adpauls@gmail.com>
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants