Description
After installing Trinity-RFT by following the CHORD tutorial, I started a run with:
trinity run --config examples/mix_chord/mix_chord.yaml
but it failed with the following error:
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] Error in Trainer:
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] Traceback (most recent call last):
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/Trinity-RFT/trinity/trainer/trainer.py", line 79, in train
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] exps, metrics, repr_samples = await sample_task
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/anaconda3/envs/trinity-vllm/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 493, in _resume_span
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] return await method(self, *_args, **_kwargs)
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/Trinity-RFT/trinity/trainer/trainer.py", line 125, in _sample_data
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] batch, metrics, repr_samples = await self.sample_strategy.sample(
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/Trinity-RFT/trinity/algorithm/sample_strategy/mix_sample_strategy.py", line 59, in sample
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] expert_exp_list = await self.expert_exp_buffer.read_async()
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/Trinity-RFT/trinity/buffer/reader/file_reader.py", line 97, in read_async
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] return self.read(batch_size)
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/Trinity-RFT/trinity/buffer/reader/file_reader.py", line 157, in read
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] task = self.formatter.format(sample)
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] File "/home/Trinity-RFT/trinity/buffer/schema/formatter.py", line 62, in format
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] assert workflow_cls is not None, "default_workflow_type or workflow_key is required"
(Trainer pid=2149038) ERROR 09-21 01:03:49 [trainer.py:93] AssertionError: default_workflow_type or workflow_key is required
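From the traceback, my rough reading of the check in trinity/buffer/schema/formatter.py is the following (only the assert is verbatim; the surrounding standalone code is my own guess, not the project's actual implementation):

# Standalone sketch of my reading of formatter.py line 62; the function
# signature and lookup logic here are guesses, only the assert is verbatim.
def format(sample: dict, default_workflow_cls=None):
    # The workflow class can come from a per-sample workflow_key or from the
    # buffer-level default_workflow_type; with neither set, the assert fires.
    workflow_cls = sample.get("workflow_key") or default_workflow_cls
    assert workflow_cls is not None, "default_workflow_type or workflow_key is required"
    return workflow_cls

format({"messages": []})  # AssertionError, matching the trace above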
So I added a default_workflow_type property under trainer_input.auxiliary_buffers.sft_dataset in mix_chord.yaml:
default_workflow_type: 'math_boxed_workflow'
But after adding it, a new error appeared:
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] Error in Trainer:
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] Traceback (most recent call last):
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] File "/home/Trinity-RFT/trinity/trainer/trainer.py", line 79, in train
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] exps, metrics, repr_samples = await sample_task
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] File "/home/anaconda3/envs/trinity-vllm/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 493, in _resume_span
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] return await method(self, *_args, **_kwargs)
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] File "/home/Trinity-RFT/trinity/trainer/trainer.py", line 125, in _sample_data
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] batch, metrics, repr_samples = await self.sample_strategy.sample(
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] File "/home/Trinity-RFT/trinity/algorithm/sample_strategy/mix_sample_strategy.py", line 65, in sample
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] exp.tokens[exp.prompt_length :], dtype=torch.float32
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93] AttributeError: 'Task' object has no attribute 'tokens'
(Trainer pid=2763112) ERROR 09-21 15:58:59 [trainer.py:93]
(Trainer pid=2763112) INFO 09-21 15:58:59 [trainer.py:186] Saving checkpoint at step 0...
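The failing expression is easy to reproduce standalone (the slice is verbatim from the traceback; the Experience/Task classes below are my own stand-ins, not the project's types):

from dataclasses import dataclass, field

import torch

@dataclass
class Experience:  # stand-in for what the sampler appears to expect
    tokens: list
    prompt_length: int

@dataclass
class Task:  # stand-in for what the formatter returns once a workflow is set
    raw: dict = field(default_factory=dict)

exp = Experience(tokens=[1, 2, 3, 4], prompt_length=2)
print(torch.tensor(exp.tokens[exp.prompt_length:], dtype=torch.float32))  # works

exp = Task()
exp.tokens[exp.prompt_length:]  # AttributeError: 'Task' object has no attribute 'tokens'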
I inspected the exp variable; it originally comes from trainer_input.experience_buffer in mix_chord.yaml:
trainer_input:
  experience_buffer:
    name: math_buffer
    storage_type: queue
    path: 'sqlite:///test_mix_chord.db'
So is the error caused by a mistake in my mix_chord.yaml? My full yaml is as follows:
project: "mix_chord"
name: "test_mix_chord"
checkpoint_root_dir: ${oc.env:TRINITY_CHECKPOINT_ROOT_DIR,./checkpoints}
algorithm:
  algorithm_type: mix_chord
  repeat_times: 8 # or 16 for better performance in math-related tasks
  kl_loss_fn_args:
    kl_coef: 0.0
  sample_strategy_args:
    expert_data_ratio: 0.20
  policy_loss_fn_args: # feel free to change, we encourage you to try out different hyperparameters
    mu_warmup_steps: 200 # 0 for chord-mu and chord-phi
    mu_decay_steps: 400 # 200 for chord-mu and 0 for chord-phi
    mu_peak: 0.5 # 0.9 for chord-mu and 0.1 for chord-phi
    mu_valley: 0.02 # 0.05 for chord-mu and 0.1 for chord-phi
    enable_phi_function: true # false for chord-mu and true for chord-phi
    clip_range: 0.2
    use_token_level_loss_in_sft: true
    use_dynamic_bsz: true
    ppo_mini_batch_size: 320 # 320 = 256 + 64; if you set repeat_times = 16, it should be 32 * 16 + 64
    ppo_micro_batch_size_per_gpu: 4
    ngpus_trainer: 4
    train_batch_size_expert: 64
    train_batch_size_usual: 256 # 32 batch size * 8 repeat times
model:
  model_path: ${oc.env:TRINITY_MODEL_PATH,/home/Trinity-RFT/download_models/Qwen/Qwen2.5-1.5B-Instruct}
  max_response_tokens: 10240
  max_model_len: 11264
cluster:
  node_num: 1
  gpu_per_node: 8
buffer:
  total_epochs: 4
  batch_size: 32
  train_batch_size: 320
  explorer_input:
    taskset:
      name: openr1_data_filtered_int
      storage_type: file
      path: ${oc.env:TRINITY_TASKSET_PATH, /home/Trinity-RFT/openr1_rl_dataset}
      format:
        prompt_key: 'problem'
        response_key: 'answer'
      rollout_args:
        temperature: 1.0
        logprobs: 0
      workflow_args:
        with_think: true
    eval_tasksets: [] # you can add your own eval tasksets here
    default_workflow_type: 'math_boxed_workflow'
  trainer_input:
    experience_buffer:
      name: math_buffer
      storage_type: queue
      path: 'sqlite:///test_mix_chord.db'
    auxiliary_buffers:
      sft_dataset:
        total_epochs: 25
        name: SFT_data
        storage_type: file
        path: ${oc.env:TRINITY_SFT_DATASET_PATH,open-r1/Mixture-of-Thoughts/all}
        split: 'train'
        format:
          prompt_type: messages
          messages_key: 'messages'
          #workflow_key: 'math_workflow'
        default_workflow_type: 'math_boxed_workflow'
explorer:
  eval_interval: 10
  runner_per_model: 8
  rollout_model:
    engine_num: 4
    tensor_parallel_size: 1
    enable_prefix_caching: false
    enforce_eager: true
    dtype: bfloat16
    seed: 42
    gpu_memory_utilization: 0.3
synchronizer:
  sync_method: 'nccl'
  sync_interval: 1
  sync_timeout: 1200
trainer:
  save_interval: 50
  trainer_config:
    actor_rollout_ref:
      model:
        use_remove_padding: true
      actor:
        use_dynamic_bsz: true
        ppo_max_token_len_per_gpu: 25600
        ulysses_sequence_parallel_size: 2
        optim:
          lr: 1e-6 # or 5e-6, larger lr with warm-up can result in better performance for SFT training.
      ref:
        log_prob_use_dynamic_bsz: ${trainer.trainer_config.actor_rollout_ref.actor.use_dynamic_bsz}
        log_prob_max_token_len_per_gpu: ${trainer.trainer_config.actor_rollout_ref.actor.ppo_max_token_len_per_gpu}
        ulysses_sequence_parallel_size: ${trainer.trainer_config.actor_rollout_ref.actor.ulysses_sequence_parallel_size}
#monitor:
#  monitor_type: wandb
#  monitor_type: none
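In case it is relevant, I also sanity-checked the SFT data itself; this snippet is only my own check against the Hugging Face dataset my path points to (I assume the 'all' suffix is the subset name), not Trinity code:

from datasets import load_dataset

# Assumes the configured path resolves to this HF dataset, subset 'all'.
ds = load_dataset("open-r1/Mixture-of-Thoughts", "all", split="train")
print(ds[0]["messages"][0])  # the messages_key from my format config exists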
If anyone has time to take a look, I would greatly appreciate any pointers!