RLOOTrainer bug when using DeepSpeed #2329

@macheng6

System Info

When launched with DeepSpeed, RLOOTrainer fails with the error: "ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library." The likely cause is that Accelerate is not properly initialized at line 120 of the RLOOTrainer code, possibly because deepspeed_plugin is not passed in.

trl: 0.12.0.dev0
transformers: 4.45.2
accelerate: 1.0.1
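
For anyone comparing environments against the versions above, here is a minimal sketch that reads installed package versions with the standard library. The helper name `report_versions` is made up for illustration; it is not part of trl, transformers, or accelerate.

```python
from importlib import metadata


def report_versions(packages):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The three packages listed in this report.
    for name, version in report_versions(["trl", "transformers", "accelerate"]).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

The output follows the same `name: version` layout as the list above, so it can be pasted into a follow-up comment.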

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

accelerate launch --config_file examples/accelerate_configs/deepspeed_zero3.yaml \
    examples/scripts/rloo/rloo.py \
    --dataset_name trl-internal-testing/descriptiveness-sentiment-trl-style \
    --dataset_train_split descriptiveness \
    --output_dir models/minimal/rloo \
    --rloo_k 2 \
    --num_ppo_epochs 1 \
    --num_mini_batches 1 \
    --learning_rate 3e-6 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --total_episodes 10000 \
    --model_name_or_path EleutherAI/pythia-1b-deduped \
    --sft_model_path EleutherAI/pythia-1b-deduped \
    --reward_model_path EleutherAI/pythia-1b-deduped \
    --local_rollout_forward_batch_size 1 \
    --missing_eos_penalty 1.0
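
To rerun the same command with a different accelerate config and check whether the error is specific to DeepSpeed, the launch line can be assembled programmatically. This is only a sketch: `build_launch_command` is a hypothetical helper, and the paths and flag values are copied from the command above.

```python
import shlex


def build_launch_command(config_file, script, script_args):
    """Return the argv list for `accelerate launch` with the given config and script flags."""
    argv = ["accelerate", "launch", "--config_file", config_file, script]
    for flag, value in script_args.items():
        argv += [f"--{flag}", str(value)]
    return argv


# Flag values kept as strings so they are passed through exactly as written in the report.
repro_args = {
    "dataset_name": "trl-internal-testing/descriptiveness-sentiment-trl-style",
    "dataset_train_split": "descriptiveness",
    "output_dir": "models/minimal/rloo",
    "rloo_k": "2",
    "num_ppo_epochs": "1",
    "num_mini_batches": "1",
    "learning_rate": "3e-6",
    "per_device_train_batch_size": "1",
    "gradient_accumulation_steps": "16",
    "total_episodes": "10000",
    "model_name_or_path": "EleutherAI/pythia-1b-deduped",
    "sft_model_path": "EleutherAI/pythia-1b-deduped",
    "reward_model_path": "EleutherAI/pythia-1b-deduped",
    "local_rollout_forward_batch_size": "1",
    "missing_eos_penalty": "1.0",
}

command = build_launch_command(
    "examples/accelerate_configs/deepspeed_zero3.yaml",
    "examples/scripts/rloo/rloo.py",
    repro_args,
)

if __name__ == "__main__":
    # Print a shell-safe command line; pass `command` to subprocess.run to execute it.
    print(shlex.join(command))
```

Changing only the first argument swaps the config file while keeping every training flag identical between runs.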

Expected behavior

Training should run under DeepSpeed without raising this ValueError.
