
Training on instructions #603

Closed
Oseltamivir opened this issue Jun 7, 2024 · 2 comments · Fixed by #604

Comments

@Oseltamivir
Contributor

Oseltamivir commented Jun 7, 2024

Similar to #150,

When training in a completions/chat/assistant/instruction format, where a prompt is given and the model is trained only on the response, a few errors occur.

Following the HF tutorial, given a dataset in the tutorial's format and the following formatting function:

def formatting_prompts_func(example):
    # `example` is a batched dict of lists; build one prompt string per row.
    output_texts = []
    for i in range(len(example['instruction'])):
        text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]}"
        output_texts.append(text)
    return output_texts  # a list of strings, as the HF/TRL API expects

training fails with AttributeError: 'list' object has no attribute 'startswith'.
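To make the type mismatch concrete, here is a self-contained sketch (the tutorial-style function applied to a made-up one-row batch):

```python
# Sketch: datasets passes batched examples as a dict of lists,
# so the formatting function returns a *list* of strings, not a string.
def formatting_prompts_func(example):
    output_texts = []
    for i in range(len(example["instruction"])):
        text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]}"
        output_texts.append(text)
    return output_texts

batch = {"instruction": ["What is 2+2?"], "output": ["4"]}
texts = formatting_prompts_func(batch)
print(type(texts).__name__)  # list -- calling .startswith on this raises AttributeError
```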

After some digging, I found that the error stems from patch_sft_trainer_tokenizer in tokenizer_utils.py:826:

L826 test_text = dataset[0][dataset_text_field] if (formatting_func is None or not use_formatting_func) else formatting_func(dataset[0])
L829 test_text.startswith(tokenizer.bos_token)

If output_texts is changed to output_texts[0], the AttributeError is resolved, but another ValueError is raised during training:

in TRL's trainer at line 557:

if not isinstance(formatting_func(element), list):
    raise ValueError

So from what I gather, formatting_func(dataset[0]) is expected to be both a list and a string at the same time, which is obviously wrong.

My solution was to change formatting_func(dataset[0]) to formatting_func(dataset[0])[0], since formatting_func returns a list per the HF docs and the transformers trainer.
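A minimal sketch of how that change might look (the helper name `first_test_text` is made up for illustration; the real logic lives inline in patch_sft_trainer_tokenizer):

```python
# Hypothetical helper mirroring the snippet quoted above from
# tokenizer_utils.py; the function name is an assumption, not Unsloth's API.
def first_test_text(dataset, dataset_text_field, formatting_func, use_formatting_func):
    if formatting_func is None or not use_formatting_func:
        return dataset[0][dataset_text_field]
    # formatting_func returns a list of strings, so index into it
    # before calling string methods such as .startswith.
    return formatting_func(dataset[0])[0]
```

With this, the later `test_text.startswith(tokenizer.bos_token)` check receives a string in both branches.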

But even with this fixed, there are some other issues. I will probably submit a PR once those issues are taken care of as well.


Also, can I ask why TRL is not pinned to the latest version? Is there a reason for using SFTTrainer(args=TrainingArguments) instead of SFTConfig?

@Oseltamivir
Contributor Author

Update: got Llama-3 training on completions/chat/assistant working locally.

I have an ipynb working that trains Llama-3, specifically nvidiachat, on completions, similar to the HF CodeAlpaca-20k tutorial. I can upload it if desired.

However:

  1. I had to modify a portion of TRL's sft_trainer.py, as it could not identify nvidia's ASSISTANT: response key in the chat template.
  2. It uses a newer version of TRL and is not yet tested on Colab, but it is working locally as of now.

@danielhanchen
Contributor

Oh very cool! Thanks for this! Will check the PR!
