Add deepspeed experiment #795
Conversation
/benchmark-trl-experiments benchmark/benchmark_level1.sh
Benchmark on Comment: succeeded ✅
The documentation is not available anymore as the PR was closed or merged.
/benchmark-trl-experiments benchmark/benchmark_level2.sh
Benchmark on Comment: succeeded ✅
/benchmark-trl-experiments benchmark/benchmark_level1.sh
/benchmark-trl-experiments benchmark/benchmark_level2.sh
Benchmark on Comment: succeeded ✅
Benchmark on Comment: succeeded ✅
Cerebras results are expected: it's training against a random reward model, so its reward learning curve should be more chaotic.
Thanks a lot for adding this sweet benchmark 🚀! I left a comment about adding a benchmark for ZeRO-3, but that can also be a separate PR if you prefer.
benchmark/benchmark_level2.sh (outdated)
@@ -1,4 +1,4 @@
-# compound
+# compound: gpt2xl + grad_accu
For my own understanding, is this compound arg documented somewhere?
The compound comment simply means we are using more features at once (e.g., in this case, we are using a larger model and gradient accumulation at the same time) :)
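For reference, a minimal sketch of what such a compound entry looks like, modeled on the ZeRO-2 entry below; the exp_name, mini_batch_size, and gradient_accumulation_steps values here are illustrative assumptions, not necessarily the exact values in the script:

# compound: gpt2xl + grad_accu (illustrative values, modeled on the entry below)
python benchmark/benchmark.py \
    --command "python examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_gpt2xl_grad_accu --ppo_config.model_name gpt2-xl --ppo_config.mini_batch_size 16 --ppo_config.gradient_accumulation_steps 8 --ppo_config.log_with wandb" \
    --slurm-template-path benchmark/trl.slurm_template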
# compound: Cerebras-GPT-6.7B + deepspeed zero2 + grad_accu
python benchmark/benchmark.py \
    --command "accelerate launch --config_file examples/accelerate_configs/deepspeed_zero2.yaml examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_Cerebras-GPT-6.7B_grad_accu_deepspeed_stage2 --ppo_config.batch_size 32 --ppo_config.mini_batch_size 32 --ppo_config.log_with wandb --ppo_config.model_name cerebras/Cerebras-GPT-6.7B --ppo_config.reward_model sentiment-analysis:cerebras/Cerebras-GPT-6.7B" \
Eventually I think we should do the "proper" thing and fine-tune these models on IMDB so we have a genuinely good policy / reward model. Of course, not necessary for this PR, but perhaps good to be as realistic as possible for the benchmark.
I think that sounds good. Perhaps we can set up an end-to-end example where we train the reward model and then the policy model in the same run.
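Such an end-to-end setup could be sketched as two chained steps. Both the reward-training script name and the local-path form of the reward_model spec below are hypothetical placeholders, not files or formats confirmed by this PR:

# Step 1: fine-tune a reward model on IMDB (hypothetical script name)
python examples/scripts/train_reward_model_imdb.py \
    --model_name cerebras/Cerebras-GPT-6.7B \
    --output_dir ./reward_model_imdb
# Step 2: PPO-tune the policy against the freshly trained reward model
# (assumes the sentiment-analysis: spec accepts a local path)
python examples/scripts/sentiment_tuning.py \
    --ppo_config.model_name cerebras/Cerebras-GPT-6.7B \
    --ppo_config.reward_model sentiment-analysis:./reward_model_imdb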
benchmark/benchmark_level2.sh (outdated)
--slurm-template-path benchmark/trl.slurm_template

# compound: Cerebras-GPT-6.7B + deepspeed zero2 + grad_accu
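For context, benchmark.py hands each --command off to Slurm through this template. A sketch of what such a template might contain; every directive and the placeholder syntax below are assumptions, not the actual contents of benchmark/trl.slurm_template:

#!/bin/bash
#SBATCH --job-name=trl-benchmark          # assumption
#SBATCH --gpus-per-task={{gpus_per_task}} # placeholder syntax is an assumption
#SBATCH --ntasks={{ntasks}}
#SBATCH --output=slurm/logs/%x_%j.out
# benchmark.py would substitute the benchmarked command here
{{command}}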
Should we also benchmark ZeRO-3?
Let's probably do this in a separate PR.
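If a ZeRO-3 run is added in that follow-up PR, it would presumably be the same entry with the config swapped. A sketch, mirroring the ZeRO-2 command above and assuming a deepspeed_zero3.yaml config exists alongside the ZeRO-2 one:

# compound: Cerebras-GPT-6.7B + deepspeed zero3 + grad_accu (hypothetical follow-up)
python benchmark/benchmark.py \
    --command "accelerate launch --config_file examples/accelerate_configs/deepspeed_zero3.yaml examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_Cerebras-GPT-6.7B_grad_accu_deepspeed_stage3 --ppo_config.batch_size 32 --ppo_config.mini_batch_size 32 --ppo_config.log_with wandb --ppo_config.model_name cerebras/Cerebras-GPT-6.7B --ppo_config.reward_model sentiment-analysis:cerebras/Cerebras-GPT-6.7B" \
    --slurm-template-path benchmark/trl.slurm_template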
* Add deepspeed experiment
* add deepspeed pip install
* update hello world.sh
* update comments
* remove cleanup