
Conversation

@toslali-ibm (Contributor)

What does this PR do?

This PR incorporates Quentin's feedback on tensor-parallel (TP) generation with vLLM colocation.

  • Only modify the GRPO trainer; leave the vLLM client untouched (see the sketch below).
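For concreteness, here is a minimal sketch of the idea — my own illustration, not the patch itself. The function name, the contiguous grouping of ranks, and the use of vLLM's `external_launcher` executor backend are assumptions:

```python
# Sketch only, NOT the code from this PR: create one colocated vLLM engine
# per TP subgroup of `vllm_colocation` adjacent ranks.
import torch.distributed as dist
from vllm import LLM

def make_colocated_engine(model_name, tp_size, gpu_mem_util, max_model_len):
    rank, world = dist.get_rank(), dist.get_world_size()
    assert world % tp_size == 0, "world size must be divisible by vllm_colocation"
    # Contiguous subgroups: 8 ranks with tp_size=2 -> [0,1], [2,3], [4,5], [6,7].
    # Every rank must call new_group() for every group, in the same order.
    tp_group = None
    for start in range(0, world, tp_size):
        ranks = list(range(start, start + tp_size))
        group = dist.new_group(ranks)
        if rank in ranks:
            tp_group = group
    # "external_launcher" (available in recent vLLM releases) makes vLLM reuse
    # the already-launched training processes instead of spawning its own
    # workers, so the engine shares each GPU with the ZeRO-3 training shards.
    llm = LLM(
        model=model_name,
        tensor_parallel_size=tp_size,
        gpu_memory_utilization=gpu_mem_util,
        max_model_len=max_model_len,
        distributed_executor_backend="external_launcher",
    )
    return llm, tp_group
```

One natural way to use this (again, an assumption about the design, not a description of the patch): each rank gathers the prompts from its TP subgroup, every rank in the group calls `llm.generate` on the same gathered batch, and each rank then keeps only its own slice of the outputs.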

Run it with:

```
VLLM_USE_V1=0 ACCELERATE_LOG_LEVEL=info CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
  accelerate launch --config_file recipes/accelerate_configs/zero3.yaml \
  --num_processes=8 -m open_r1.grpo --config config_tpcoloc.yaml
```

config.yaml:

```yaml
# Model arguments
model_name_or_path: Qwen/Qwen2.5-Math-1.5B
model_revision: main
torch_dtype: bfloat16
attn_implementation: flash_attention_2

# Data training arguments
dataset_name: DigitalLearningGmbH/MATH-lighteval
dataset_config: default
dataset_prompt_column: problem
system_prompt: "You are a helpful AI Assistant, designed to provide well-reasoned and detailed responses. You FIRST think about the reasoning process as an internal monologue and then provide the user with the answer. The reasoning process MUST BE enclosed within <think> and </think> tags."

# GRPO trainer config
bf16: true
use_vllm: true
vllm_colocation: 2
vllm_gpu_memory_utilization: 0.3
vllm_max_model_len: 2048
do_eval: false
gradient_accumulation_steps: 1
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 3.0e-06
log_completions: false
log_level: info
logging_first_step: true
logging_steps: 1
logging_strategy: steps
lr_scheduler_type: cosine
max_prompt_length: 512
max_completion_length: 1024
max_steps: 50
num_generations: 8
num_train_epochs: 1
overwrite_output_dir: true
# per_device_eval_batch_size: 16
per_device_train_batch_size: 16
push_to_hub: false
report_to:
- wandb
reward_funcs:
- accuracy
- format
reward_weights:
- 1.0
- 1.0
save_strategy: steps
save_steps: 100
save_total_limit: 1
seed: 42
warmup_ratio: 0.1
```
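To read the key knobs together (assuming, as the launch command suggests, that `vllm_colocation` is the TP size of each colocated engine): with `--num_processes=8` and `vllm_colocation: 2`, each vLLM engine spans 2 GPUs, so the 8 GPUs host 4 engines. `vllm_gpu_memory_utilization: 0.3` reserves roughly 30% of each GPU's memory for the engine's weights and KV cache, leaving the remainder for the ZeRO-3 training state. On the training side, 8 processes × `per_device_train_batch_size: 16` = 128 completions per step, which with `num_generations: 8` corresponds to 16 distinct prompts per step.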

@qgallouedec (Member)
closed via #3394

@qgallouedec closed this on May 2, 2025.