OnlineDPO error when using DeepSpeed ZeRO-3 #7

August-murr · 2025-01-06T17:26:10Z

System Info

`
transformers 4.47.0
triton 3.0.0
trl 0.12.1
trove-classifiers 2024.10.21.16
truststore 0.8.0
typer 0.14.0
types-dataclasses 0.6.6
typing_extensions 4.12.2
typing-inspect 0.9.0
tzdata 2024.2
tzlocal 5.2
ujson 5.10.0
urllib3 2.2.2
utils 1.0.2
uvicorn 0.32.1
uvloop 0.21.0
virtualenv 20.28.0
vllm 0.6.3
vllm-flash-attn 2.6.1

trl env
`Copy-paste the following information when reporting an issue:

Platform: Linux-5.4.143-2-velinux1-amd64-x86_64-with-glibc2.35
Python version: 3.11.9
PyTorch version: 2.4.0
CUDA device(s): NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB, NVIDIA A100-SXM4-80GB
Transformers version: 4.47.0
Accelerate version: 1.1.1
Accelerate config: not found
Datasets version: 3.1.0
HF Hub version: 0.26.3
TRL version: 0.12.1
bitsandbytes version: 0.45.0
DeepSpeed version: 0.16.1
Diffusers version: not installed
Liger-Kernel version: not installed
LLM-Blender version: 0.0.2
OpenAI version: 1.57.0
PEFT version: 0.13.2`

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder
My own task or dataset (give details below)

Reproduction

import json
import numpy as np
from torch.utils.data import Dataset  # assumption: Dataset is the PyTorch base class

class UnifiedDPODataset(Dataset):
"""
Unified DPO dataset.
"""
def __init__(self, file, tokenizer, max_seq_length, max_prompt_length, template,
             maximum_es_score, minimum_es_score, bool_training: bool):
    self.tokenizer = tokenizer
    self.template_name = template.template_name
    #==None
    self.system_format = template.system_format
    self.user_format = template.user_format
    self.assistant_format = template.assistant_format
    self.system = template.system

    self.max_seq_length = max_seq_length
    self.max_prompt_length = max_prompt_length
    logger.info('Loading data: {}'.format(file))
    with open(file, 'r', encoding='utf-8') as f:
        raw_data_list = f.readlines()
        # filter the data by the es_score key
        for check_data in raw_data_list:
            try:
                json.loads(check_data)
            except json.decoder.JSONDecodeError as e:
                print(f'JSONDecodeError={e.args},check_data={check_data}')
        # data_list = [json.loads(data) for data in raw_data_list 
        #     if float(json.loads(data)['es_score']) >= minimum_es_score 
        #     and float(json.loads(data)['es_score']) <= maximum_es_score]
        if bool_training:
            data_list=[]
            for data_str_iter in raw_data_list:
                data_json_iter=json.loads(data_str_iter)
                if isinstance(data_json_iter['es_score'],dict):
                    es_score=max([float(elem) for elem in list(data_json_iter['es_score'].values())])
                else:
                    es_score=float(data_json_iter['es_score'])
                if es_score >= minimum_es_score and es_score <= maximum_es_score:
                    data_list.append(data_json_iter)
        else:
            data_list=[json.loads(data_str_iter) for data_str_iter in raw_data_list]
    logger.info(f"Use template {self.template_name} for training, bool_training={bool_training}, there are {len(data_list)} samples in the dataset, raw sample count={len(raw_data_list)}")
    self.data_list = data_list

def __len__(self):
    return len(self.data_list)

def build_prompt_input_ids(self, system, history):
    """
    chatglm2: [gMASK]sop [Round 1]\n\n问：{input1}\n\n答：{target1}</s>[Round 2]\n\n问：{input2}\n\n答：{target2}</s>...
    chatglm3: [gMASK]sop <|system|>xxx<|user|>xxx<|assistant|>xxx<eos>
    others: {system_format}{user_format}{assistant_format}{user_format}{assistant_format}...
    """
    # chatglm models have special prefix tokens
    if self.template_name in ['chatglm2', 'chatglm3']:
        prompt_input_ids = self.tokenizer.get_prefix_tokens()
    else:
        prompt_input_ids = []
    prompt=''
    # collect system information
    if self.system_format is not None:
        system = system if system is not None else self.system
        # the system message is not empty
        if system is not None:
            if self.template_name == 'chatglm3':
                # assumption: use the raw system content as the prompt text here;
                # system_text was previously undefined on this branch (NameError)
                system_text = system
                prompt_input_ids += [self.tokenizer.get_command("<|system|>")] + self.tokenizer.encode(system, add_special_tokens=False)
            else:
                system_text = self.system_format.format(content=system)
                prompt_input_ids += self.tokenizer.encode(system_text, add_special_tokens=False)
            prompt += system_text
    # collect history
    ## concatenate the multi-turn user/assistant prompt text and input_ids
    for i, conv in enumerate(history):
        role = conv['role'].strip()
        content = conv['content'].strip()

        assert role != 'system', 'there should not be more than one system information'
        text_iter=''
        if role == 'user':
            if self.template_name == 'chatglm2':
                human = self.user_format.format(content=content, idx=i//2 + 1)
                input_ids = self.tokenizer.encode(human, add_special_tokens=False)
            elif self.template_name == 'chatglm3':
                # assumption: use the raw content as the prompt text;
                # `human` was previously undefined on this branch (NameError)
                human = content
                input_ids = [self.tokenizer.get_command("<|user|>")] + \
                            self.tokenizer.encode(content, add_special_tokens=False) + \
                            [self.tokenizer.get_command("<|assistant|>")]
            else:
                human = self.user_format.format(content=content, stop_token=self.tokenizer.eos_token)
                input_ids = self.tokenizer.encode(human, add_special_tokens=False)
            text_iter=human
        elif role == 'assistant':
            if self.template_name in ['chatglm2', 'chatglm3']:
                # assumption: use the raw content as the prompt text;
                # `assistant` was previously undefined on this branch (NameError)
                assistant = content
                input_ids = self.tokenizer.encode(content, add_special_tokens=False) + [self.tokenizer.eos_token_id]
            else:
                assistant = self.assistant_format.format(content=content, stop_token=self.tokenizer.eos_token)
                input_ids = self.tokenizer.encode(assistant, add_special_tokens=False)
            text_iter=assistant
        else:
            raise Exception('role error')
        prompt_input_ids += input_ids
        prompt += text_iter

    return prompt_input_ids,prompt

def __getitem__(self, index):
    data = self.data_list[index]
    # data = json.loads(data)
    chosen = data['chosen']
    rejected = data['rejected']
    assert len(chosen) == len(rejected)

    # check whether the first message is a system message
    if chosen[0]['role'] == 'system':
        system = chosen[0]['content'].strip()
        history = chosen[1:-1]  # conversation context
        chosen, rejected = chosen[-1], rejected[-1]
    else:
        # user/assistant only; for single-turn data, history is empty
        system = None
        history = chosen[:-1]  # conversation context
        ## chosen/rejected: the last turn, i.e. the assistant's reply
        chosen, rejected = chosen[-1], rejected[-1]

    # build prompt
    # build the system and history parts
    prompt_input_ids,prompt = self.build_prompt_input_ids(system, history)

    # build response
    if self.template_name in ['chatglm2', 'chatglm3']:
        chosen_input_ids = self.tokenizer.encode(chosen['content'], add_special_tokens=False) + [self.tokenizer.eos_token_id]
        rejected_input_ids = self.tokenizer.encode(rejected['content'], add_special_tokens=False) + [self.tokenizer.eos_token_id]
    else:
        # formatted text for the chosen content
        chosen = self.assistant_format.format(content=chosen['content'], stop_token=self.tokenizer.eos_token)
        # formatted text for the rejected content
        rejected = self.assistant_format.format(content=rejected['content'], stop_token=self.tokenizer.eos_token)

        chosen_input_ids = self.tokenizer.encode(chosen, add_special_tokens=False)
        rejected_input_ids = self.tokenizer.encode(rejected, add_special_tokens=False)

    # truncate by max_seq_length
    ## TODO: truncate or filter overly long samples when building the corpus, so they do not get too long
    longer_response_length = max(len(chosen_input_ids), len(rejected_input_ids))
    # if combined sequence is too long, truncate the prompt
    if len(prompt_input_ids) + longer_response_length > self.max_seq_length:
        # take the larger of the static max_prompt_length and max_seq_length - longer_response_length
        max_prompt_length = max(self.max_prompt_length, self.max_seq_length - longer_response_length)
        # truncate
        prompt_input_ids = prompt_input_ids[-max_prompt_length:]
    # if that's still too long, truncate the response
    ## ?? in which case is it still too long?
    ## (it happens when self.max_prompt_length > self.max_seq_length - longer_response_length)
    if len(prompt_input_ids) + longer_response_length > self.max_seq_length:
        chosen_input_ids = chosen_input_ids[: self.max_seq_length - len(prompt_input_ids)]
        rejected_input_ids = rejected_input_ids[: self.max_seq_length - len(prompt_input_ids)]
    chosen_content_of_assist_len=len(chosen_input_ids)
    reject_content_of_assist_len=len(rejected_input_ids)
    chosen_labels = [-100] * len(prompt_input_ids) + chosen_input_ids
    chosen_input_ids = prompt_input_ids + chosen_input_ids
    rejected_labels = [-100] * len(prompt_input_ids) + rejected_input_ids
    rejected_input_ids = prompt_input_ids + rejected_input_ids
    assert len(chosen_labels) == len(chosen_input_ids)
    assert len(rejected_labels) == len(rejected_input_ids)
    if np.random.random()<0.01:
        info_msg=f'longer_response_length={longer_response_length},prompt_input_ids len={len(prompt_input_ids)},'+ \
        f'chosen_answer_len={chosen_content_of_assist_len},reject_answer_len={reject_content_of_assist_len},'+\
        f'chosen_len_with_prompt={len(chosen_input_ids)},rejected_len_with_prompt={len(rejected_input_ids)},'+\
        f'prompt_input_ids={prompt_input_ids},\n chosen_input_ids={chosen_input_ids},\n rejected_input_ids={rejected_input_ids},\n'+\
        f'chosen_labels={chosen_labels},\n rejected_labels={rejected_labels}'
        print(info_msg) 
    inputs = dict(
        prompt_input_ids=prompt_input_ids,
        prompt_attention_mask=[1]*len(prompt_input_ids),
        chosen_input_ids=chosen_input_ids,
        chosen_attention_mask=[1]*len(chosen_input_ids),
        chosen_labels=chosen_labels,
        rejected_input_ids=rejected_input_ids,
        rejected_attention_mask=[1]*len(rejected_input_ids),
        rejected_labels=rejected_labels,
        prompt=prompt
    )
    return inputs

# no-op stub to match the dataset interface expected by DPOTrainer
def map(self, func, **kwargs):
    return self
def select(self,index_list):
    select_data_list=[]
    for index in index_list:
        data_iter=self.data_list[index]
        select_data_list.append(data_iter)
    return select_data_list
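
A side note on the two-stage truncation in `__getitem__` above, since the comment there asks when the sequence can still be too long. The following is only a minimal standalone sketch on plain Python lists; the function name `truncate_pair` and the toy ids are hypothetical and not part of my training code. It shows that when the static `max_prompt_length` is larger than `max_seq_length - longer_response_length`, the prompt keeps `max_prompt_length` tokens and the responses are then cut to the remaining budget.

```python
def truncate_pair(prompt_ids, chosen_ids, rejected_ids, max_seq_length, max_prompt_length):
    """Mirror the two-stage truncation from UnifiedDPODataset.__getitem__ on plain lists."""
    longer = max(len(chosen_ids), len(rejected_ids))
    # Stage 1: keep only the tail of the prompt if prompt + response is too long.
    if len(prompt_ids) + longer > max_seq_length:
        keep = max(max_prompt_length, max_seq_length - longer)
        prompt_ids = prompt_ids[-keep:]
    # Stage 2: if it is still too long, cut both responses to the remaining budget.
    if len(prompt_ids) + longer > max_seq_length:
        budget = max_seq_length - len(prompt_ids)
        chosen_ids = chosen_ids[:budget]
        rejected_ids = rejected_ids[:budget]
    return prompt_ids, chosen_ids, rejected_ids


# Hypothetical toy ids, not real tokenizer output.
prompt, chosen, rejected = truncate_pair(
    prompt_ids=list(range(8)),
    chosen_ids=list(range(100, 107)),
    rejected_ids=list(range(200, 205)),
    max_seq_length=10,
    max_prompt_length=6,
)
print(len(prompt), len(chosen), len(rejected))  # prints: 6 4 4
```

In this toy case stage 1 keeps 6 prompt tokens (the static limit wins over 10 - 7 = 3), so stage 2 has to trim both responses to 4 tokens.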

class UnifiedOnlineDPODataset(UnifiedDPODataset):
def __init__(self, file, tokenizer, max_seq_length, template,
             maximum_es_score, minimum_es_score, bool_training: bool):
    max_prompt_length = max_seq_length
    super(UnifiedOnlineDPODataset, self).__init__(file=file, tokenizer=tokenizer, max_seq_length=max_seq_length,
        max_prompt_length=max_prompt_length, template=template, maximum_es_score=maximum_es_score, minimum_es_score=minimum_es_score,
        bool_training=bool_training)
def __getitem__(self, index):
    data = self.data_list[index]
    # build prompt
    # build the system and history parts
    # check whether the first message is a system message
    # chosen = data['chosen']
    # if chosen[0]['role'] == 'system':
    #     system = chosen[0]['content'].strip()
    #     history = chosen[1:-1]  # conversation context
    #     chosen = chosen[-1]
    # else:
    #     # user/assistant only; for single-turn data, history is empty
    #     system = None
    #     history = chosen[:-1]  # conversation context
    #     ## chosen/rejected: the last turn, i.e. the assistant's reply
    #     chosen = chosen[-1]
    # prompt_input_ids,prompt = self.build_prompt_input_ids(system, history)

    prompt=data['prompt']
    groundtruth=data['groundtruth']
    # self.build_prompt_input_ids(system, history)       
    # prompt_input_ids = self.tokenizer.encode(prompt, add_special_tokens=False) + [self.tokenizer.eos_token_id]
    prompt_input_ids = self.tokenizer.encode(prompt, add_special_tokens=False)
    ##todo assert fim_end
    assert groundtruth.endswith(TC.DS_EOS_TOKEN)
    assert not prompt.endswith(TC.DS_EOS_TOKEN)
    # system = None

    # truncate by max_seq_length
    ## TODO: truncate or filter overly long samples when building the corpus, so they do not get too long
    # if combined sequence is too long, truncate the prompt
    if len(prompt_input_ids) > self.max_prompt_length:
        # truncate
        prompt_input_ids = prompt_input_ids[-self.max_prompt_length:]        
        decoded_prompt=self.tokenizer.decode(prompt_input_ids,skip_special_tokens=False)
        double_decoded_prompt_ids=self.tokenizer.encode(decoded_prompt,add_special_tokens=False)
        #for check 
        try:
            zipo_decode_tuple_list=list(zip(prompt[-self.max_prompt_length:][::-1],decoded_prompt[-self.max_prompt_length:][::-1]))[::-1]
            zipo_decode_id_tuple_list=list(zip(prompt_input_ids[::-1],double_decoded_prompt_ids[::-1]))[::-1]
            assert decoded_prompt[-self.max_prompt_length:]==prompt[-self.max_prompt_length:]
        except AssertionError as e:
            # print(f'decoded_prompt[-self.max_prompt_length:]=\n{decoded_prompt[-self.max_prompt_length:]},\nprompt[-self.max_prompt_length:]={prompt[-self.max_prompt_length:]},')        
            print(f'decoded_prompt[-self.max_prompt_length:]={decoded_prompt[-self.max_prompt_length:]},\n'+\
                f'prompt[-self.max_prompt_length:]={prompt[-self.max_prompt_length:]},\n'+\
                f'zipo_decode_tuple_list={zipo_decode_tuple_list},\nzipo_decode_id_tuple_list={zipo_decode_id_tuple_list}')
        prompt=decoded_prompt
        # prompt=self.tokenizer.convert_ids_to_tokens(prompt_input_ids)
    inputs = dict(
        # prompt_input_ids=prompt_input_ids,
        prompt_input_ids=prompt_input_ids,
        prompt_attention_mask=[1]*len(prompt_input_ids),
        prompt=prompt,
        groundtruth=groundtruth
    )
    return inputs
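
Both dataset classes go through the `es_score` filter in `UnifiedDPODataset.__init__`, where `es_score` can be either a single value or a dict of values (in which case the maximum is used). A minimal standalone sketch of that filter follows, in case it helps with reproducing the data loading; the helper names and the sample JSON lines are hypothetical, not my real data.

```python
import json


def es_score_of(record):
    """Return the score used for filtering: the max of a dict, or the scalar itself."""
    score = record["es_score"]
    if isinstance(score, dict):
        return max(float(v) for v in score.values())
    return float(score)


def filter_by_es_score(lines, minimum_es_score, maximum_es_score):
    """Parse JSON lines and keep records whose score lies in the closed interval."""
    kept = []
    for line in lines:
        record = json.loads(line)
        if minimum_es_score <= es_score_of(record) <= maximum_es_score:
            kept.append(record)
    return kept


# Hypothetical sample lines covering scalar, dict, and string-encoded scores.
sample_lines = [
    '{"prompt": "a", "es_score": 0.2}',
    '{"prompt": "b", "es_score": {"x": 0.1, "y": 0.9}}',
    '{"prompt": "c", "es_score": "0.5"}',
]
kept = filter_by_es_score(sample_lines, minimum_es_score=0.4, maximum_es_score=1.0)
print([record["prompt"] for record in kept])  # prints: ['b', 'c']
```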

Expected behavior

2024-12-30 10:53:44.559 [rank4]: Traceback (most recent call last):
2024-12-30 10:53:44.559 [rank4]:   File "/lpai-running/code/firefly-zyy-dev/339ecc/shells/../train_onlinedpo.py", line 251, in <module>
2024-12-30 10:53:44.559 [rank4]:     main()
2024-12-30 10:53:44.559 [rank4]:   File "/lpai-running/code/firefly-zyy-dev/339ecc/shells/../train_onlinedpo.py", line 195, in main
2024-12-30 10:53:44.559 [rank4]:     train_result=trainer.train()
2024-12-30 10:53:44.559 [rank4]:                  ^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 2164, in train
2024-12-30 10:53:44.559 [rank4]:     return inner_training_loop(
2024-12-30 10:53:44.559 [rank4]:            ^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 2522, in _inner_training_loop
2024-12-30 10:53:44.559 [rank4]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
2024-12-30 10:53:44.559 [rank4]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/lpai-running/code/firefly-zyy-dev/339ecc/models/online_dpo_trainer.py", line 480, in training_step
2024-12-30 10:53:44.559 [rank4]:     output = unwrapped_model.generate(
2024-12-30 10:53:44.559 [rank4]:              ^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-12-30 10:53:44.559 [rank4]:     return func(*args, **kwargs)
2024-12-30 10:53:44.559 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/generation/utils.py", line 2252, in generate
2024-12-30 10:53:44.559 [rank4]:     result = self._sample(
2024-12-30 10:53:44.559 [rank4]:              ^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/generation/utils.py", line 3254, in _sample
2024-12-30 10:53:44.559 [rank4]:     outputs = model_forward(**model_inputs, return_dict=True)
2024-12-30 10:53:44.559 [rank4]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-12-30 10:53:44.559 [rank4]:     return self._call_impl(*args, **kwargs)
2024-12-30 10:53:44.559 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1603, in _call_impl
2024-12-30 10:53:44.559 [rank4]:     result = forward_call(*args, **kwargs)
2024-12-30 10:53:44.559 [rank4]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1163, in forward
2024-12-30 10:53:44.559 [rank4]:     outputs = self.model(
2024-12-30 10:53:44.559 [rank4]:               ^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-12-30 10:53:44.559 [rank4]:     return self._call_impl(*args, **kwargs)
2024-12-30 10:53:44.559 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-12-30 10:53:44.559 [rank4]:     return forward_call(*args, **kwargs)
2024-12-30 10:53:44.559 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 883, in forward
2024-12-30 10:53:44.559 [rank4]:     causal_mask = self._update_causal_mask(
2024-12-30 10:53:44.559 [rank4]:                   ^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 993, in _update_causal_mask
2024-12-30 10:53:44.559 [rank4]:     causal_mask = self._prepare_4d_causal_attention_mask_with_cache_position(
2024-12-30 10:53:44.559 [rank4]:                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-12-30 10:53:44.559 [rank4]:   File "/opt/conda/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1060, in _prepare_4d_causal_attention_mask_with_cache_position
2024-12-30 10:53:44.559 [rank4]:     causal_mask *= torch.arange(target_length, device=device) > cache_position.reshape(-1, 1)
2024-12-30 10:53:44.559 [rank4]: RuntimeError: The size of tensor a (4137) must match the size of tensor b (4138) at non-singleton dimension 0
2024-12-30 10:53:44.559 [rank4]: Exception raised from infer_size_impl at /opt/conda/conda-bld/pytorch_1720538435607/work/aten/src/ATen/ExpandUtils.cpp:31 (most recent call first):
2024-12-30 10:53:44.559 [rank4]: C++ CapturedTraceback:
2024-12-30 10:53:44.559 [rank4]: #4 std::_Function_handler<std::shared_ptr<c10::LazyValue<std::string> const> (), c10::SetStackTraceFetcher(std::function<std::string ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) from Logging.cpp:0

2024-12-30 10:53:44.559 [rank4]: #40 do_call_core from /usr/local/src/conda/python-3.11.9/Python/ceval.c:7349
2024-12-30 10:53:44.559 [rank4]: #41 _PyEval_EvalFrame from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_ceval.h:73
2024-12-30 10:53:44.559 [rank4]: #42 method_vectorcall from /usr/local/src/conda/python-3.11.9/Objects/classobject.c:59
2024-12-30 10:53:44.559 [rank4]: #43 _PyVectorcall_Call from /usr/local/src/conda/python-3.11.9/Objects/call.c:257
2024-12-30 10:53:44.559 [rank4]: #44 do_call_core from /usr/local/src/conda/python-3.11.9/Python/ceval.c:7349
2024-12-30 10:53:44.559 [rank4]: #45 _PyEval_EvalFrame from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_ceval.h:73
2024-12-30 10:53:44.559 [rank4]: #46 method_vectorcall from /usr/local/src/conda/python-3.11.9/Objects/classobject.c:59
2024-12-30 10:53:44.559 [rank4]: #47 _PyVectorcall_Call from /usr/local/src/conda/python-3.11.9/Objects/call.c:257
2024-12-30 10:53:44.559 [rank4]: #48 do_call_core from /usr/local/src/conda/python-3.11.9/Python/ceval.c:7349
2024-12-30 10:53:44.559 [rank4]: #49 _PyEval_EvalFrame from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_ceval.h:73
2024-12-30 10:53:44.559 [rank4]: #50 method_vectorcall from /usr/local/src/conda/python-3.11.9/Objects/classobject.c:59
2024-12-30 10:53:44.559 [rank4]: #51 _PyVectorcall_Call from /usr/local/src/conda/python-3.11.9/Objects/call.c:257
2024-12-30 10:53:44.559 [rank4]: #52 do_call_core from /usr/local/src/conda/python-3.11.9/Python/ceval.c:7349
2024-12-30 10:53:44.559 [rank4]: #53 _PyEval_EvalFrame from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_ceval.h:73
2024-12-30 10:53:44.559 [rank4]: #54 _PyVectorcall_Call from /usr/local/src/conda/python-3.11.9/Objects/call.c:257
2024-12-30 10:53:44.559 [rank4]: #55 do_call_core from /usr/local/src/conda/python-3.11.9/Python/ceval.c:7349
2024-12-30 10:53:44.559 [rank4]: #56 _PyEval_EvalFrame from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_ceval.h:73
2024-12-30 10:53:44.559 [rank4]: #57 method_vectorcall from /usr/local/src/conda/python-3.11.9/Objects/classobject.c:59
2024-12-30 10:53:44.559 [rank4]: #58 _PyVectorcall_Call from /usr/local/src/conda/python-3.11.9/Objects/call.c:257
2024-12-30 10:53:44.559 [rank4]: #59 partial_call from /usr/local/src/conda/python-3.11.9/Modules/_functoolsmodule.c:324
2024-12-30 10:53:44.559 [rank4]: #60 _PyObject_MakeTpCall from /usr/local/src/conda/python-3.11.9/Objects/call.c:214
2024-12-30 10:53:44.559 [rank4]: #61 _PyObject_VectorcallTstate from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_call.h:92
2024-12-30 10:53:44.559 [rank4]: #62 _PyEval_EvalFrameDefault from /usr/local/src/conda/python-3.11.9/Python/ceval.c:4769
2024-12-30 10:53:44.559 [rank4]: #63 _PyEval_EvalFrame from /usr/local/src/conda/python-3.11.9/Include/internal/pycore_ceval.h:73
2024-12-30 10:53:44.559 [rank4]: #64 PyEval_EvalCode from /usr/local/src/conda/python-3.11.9/Python/ceval.c:1148
2024-12-30 10:53:44.559 [rank4]: #65 run_eval_code_obj from /usr/local/src/conda/python-3.11.9/Python/pythonrun.c:1741
2024-12-30 10:53:44.559 [rank4]: #66 run_mod from /usr/local/src/conda/python-3.11.9/Python/pythonrun.c:1762
2024-12-30 10:53:44.559 [rank4]: #67 pyrun_file from /usr/local/src/conda/python-3.11.9/Python/pythonrun.c:1657
2024-12-30 10:53:44.559 [rank4]: #68 _PyRun_SimpleFileObject from /usr/local/src/conda/python-3.11.9/Python/pythonrun.c:440
2024-12-30 10:53:44.559 [rank4]: #69 _PyRun_AnyFileObject from /usr/local/src/conda/python-3.11.9/Python/pythonrun.c:79
2024-12-30 10:53:44.559 [rank4]: #70 pymain_run_file_obj from /usr/local/src/conda/python-3.11.9/Modules/main.c:360
2024-12-30 10:53:44.559 [rank4]: #71 Py_BytesMain from /usr/local/src/conda/python-3.11.9/Modules/main.c:734
2024-12-30 10:53:44.559 [rank4]: #72 __libc_start_call_main from ./csu/../sysdeps/nptl/libc_start_call_main.h:58
2024-12-30 10:53:44.559 [rank4]: #73 __libc_start_main_impl from ./csu/../csu/libc-start.c:392
2024-12-30 10:53:44.559 [rank4]: #74 _start from ??:0
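
To make the `RuntimeError` easier to read: the failing line multiplies `causal_mask` in place by `torch.arange(target_length) > cache_position.reshape(-1, 1)`, so the two operands must agree on dimension 0. The sketch below is a pure-Python illustration of the general broadcasting rule only, not PyTorch's implementation. `broadcast_shape` and `TARGET_LENGTH` are hypothetical (the real `target_length` is not in my log), and I am assuming that "tensor a" is the left operand `causal_mask` with 4137 rows while `cache_position` has 4138 entries.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError on a mismatch."""
    ndim = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for dim, (x, y) in enumerate(zip(a, b)):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(
                f"The size of tensor a ({x}) must match the size of tensor b ({y}) "
                f"at non-singleton dimension {dim}"
            )
    return tuple(out)


TARGET_LENGTH = 4200  # hypothetical value; the real target_length is not shown in the log

# arange(target_length) has shape (T,), cache_position.reshape(-1, 1) has shape (N, 1).
rhs_shape = broadcast_shape((TARGET_LENGTH,), (4138, 1))

try:
    # causal_mask is assumed to have shape (sequence_length, target_length).
    broadcast_shape((4137, TARGET_LENGTH), rhs_shape)
except ValueError as err:
    message = str(err)
    print(message)
```

Under these assumptions the comparison result has 4138 rows while the mask has 4137, which is the same off-by-one between the input length and `cache_position` that the traceback reports.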

Checklist

I have checked that my issue isn't already filed (see open issues)
I have included my system information
Any code provided is minimal, complete, and reproducible (more on MREs)
Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
Any traceback provided is complete

github-actions bot added bug labels Jan 6, 2025

August-murr removed bug labels Jan 6, 2025

github-actions bot added ❓ question Seeking clarification or more information 🏋 Iterative SFT Related to Iterative SFT ✨ enhancement New feature or request 🎯 optimal import sentence 🏋 RLOO Related to RLOO 👖 action-adventure labels Jan 12, 2025
