
Add deepspeed experiment #795

Merged
merged 5 commits into from
Sep 20, 2023

Conversation

vwxyzjn
Contributor

@vwxyzjn vwxyzjn commented Sep 19, 2023

No description provided.

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 19, 2023

/benchmark-trl-experiments benchmark/benchmark_level1.sh

@github-actions

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6239403115

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 19, 2023

The documentation is not available anymore as the PR was closed or merged.

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 19, 2023

/benchmark-trl-experiments benchmark/benchmark_level2.sh

@github-actions

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6239487870

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 19, 2023

/benchmark-trl-experiments benchmark/benchmark_level1.sh

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 19, 2023

/benchmark-trl-experiments benchmark/benchmark_level2.sh

@github-actions

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6239645721

@github-actions

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6239646225

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 19, 2023

[COSTA BENCHMARK BOT]: Here are the results
different_models.png
different_models-time.png

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 19, 2023

[COSTA BENCHMARK BOT]: Here are the results
different_models.png
deepspeed-time.png
different_models-time.png
deepspeed.png

@vwxyzjn
Contributor Author

vwxyzjn commented Sep 20, 2023

The Cerebras results are expected: it's training against a random reward model, so its reward learning curve should be more chaotic.

@vwxyzjn vwxyzjn requested a review from lewtun September 20, 2023 13:16
Member

@lewtun lewtun left a comment

Thanks a lot for adding this sweet benchmark 🚀! I left a comment about adding a benchmark for ZeRO-3, but that can also be a separate PR if you prefer.

@@ -1,4 +1,4 @@
# compound
# compound: gpt2xl + grad_accu
Member

For my own understanding, is this compound arg documented somewhere?

Contributor Author

The compound comment simply means we are using more features at once (e.g., in this case, we are using a larger model and gradient accumulation at the same time) :)
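To illustrate what the grad_accu part buys you, here is a minimal, hypothetical sketch of gradient accumulation (a toy SGD loop, not TRL's actual implementation): gradients from several mini-batches are averaged before a single optimizer step, so you get the effective batch size of a larger batch without the memory cost.

```python
def sgd_with_grad_accu(param, mini_batch_grads, accu_steps, lr=0.1):
    """Toy 1-D SGD: average gradients over `accu_steps` mini-batches,
    then apply one optimizer step (illustrative sketch, not TRL code)."""
    accum = 0.0
    for i, grad in enumerate(mini_batch_grads, start=1):
        accum += grad                # accumulate instead of stepping
        if i % accu_steps == 0:      # one "large batch" optimizer step
            param -= lr * (accum / accu_steps)
            accum = 0.0
    return param

# Four mini-batch gradients of 1.0 with accu_steps=2 behave like two
# large-batch steps with gradient 1.0: the parameter moves by 2 * lr.
final = sgd_with_grad_accu(0.0, [1.0, 1.0, 1.0, 1.0], accu_steps=2)
```

The same idea applies per tensor in a real training loop; frameworks just defer `optimizer.step()` until the accumulation window closes.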

# compound: Cerebras-GPT-6.7B + deepspeed zero2 + grad_accu
python benchmark/benchmark.py \
--command "accelerate launch --config_file examples/accelerate_configs/deepspeed_zero2.yaml examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_Cerebras-GPT-6.7B_grad_accu_deepspeed_stage2 --ppo_config.batch_size 32 --ppo_config.mini_batch_size 32 --ppo_config.log_with wandb --ppo_config.model_name cerebras/Cerebras-GPT-6.7B --ppo_config.reward_model sentiment-analysis:cerebras/Cerebras-GPT-6.7B" \
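For context, the deepspeed_zero2.yaml referenced above is an accelerate config file; a config of roughly this shape would select ZeRO stage 2. The values below are illustrative assumptions, not the actual contents of the repo's config:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2                    # ZeRO-2: shard optimizer state and gradients
  gradient_accumulation_steps: 1
  offload_optimizer_device: none
mixed_precision: bf16
num_processes: 8                   # illustrative GPU count
```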
Member

Eventually I think we should do the "proper" thing and fine-tune these models on IMDB so we have a genuinely good policy / reward model. Of course, it's not necessary for this PR, but it would be good to be as realistic as possible for the benchmark.

Contributor Author

I think that sounds good. Perhaps we can set up an end-to-end example where we train the reward model and then the policy model at the same time.

--slurm-template-path benchmark/trl.slurm_template

# compound: Cerebras-GPT-6.7B + deepspeed zero2 + grad_accu
Member

Should we also benchmark ZeRO-3?

Contributor Author

Let's probably do this in a separate PR.
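If a ZeRO-3 benchmark is added later, it could plausibly mirror the ZeRO-2 command above, swapping in a stage-3 accelerate config. This is a sketch: the config path deepspeed_zero3.yaml and the exp_name are assumptions, not files or names confirmed by this PR.

```shell
# Hypothetical ZeRO-3 variant of the ZeRO-2 benchmark command above;
# the config file examples/accelerate_configs/deepspeed_zero3.yaml is assumed.
python benchmark/benchmark.py \
    --command "accelerate launch --config_file examples/accelerate_configs/deepspeed_zero3.yaml examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_Cerebras-GPT-6.7B_grad_accu_deepspeed_stage3 --ppo_config.batch_size 32 --ppo_config.mini_batch_size 32 --ppo_config.log_with wandb --ppo_config.model_name cerebras/Cerebras-GPT-6.7B --ppo_config.reward_model sentiment-analysis:cerebras/Cerebras-GPT-6.7B"
```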

@vwxyzjn vwxyzjn merged commit b8f0c4c into huggingface:main Sep 20, 2023
lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024
* Add deepspeed experiment

* add deepspeed pip install

* update hello world.sh

* update comments

* remove cleanup