
Training on instructions #603

Closed
Oseltamivir opened this issue Jun 7, 2024 · 2 comments · Fixed by #604

Comments

@Oseltamivir
Contributor

Oseltamivir commented Jun 7, 2024

Similar to #150,

When training in a completions/chat/assistant/instruction format, where a prompt is given and the model is trained only on the response, a few errors occur.

Following the HF tutorial, given a dataset in the tutorial's format and the following formatting function:

def formatting_prompts_func(example):
    # `example` is a batched dict of lists; build one prompt string per row.
    output_texts = []
    for i in range(len(example['instruction'])):
        text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]}"
        output_texts.append(text)
    return output_texts  # a list of strings, as the HF/TRL API expects

training fails with AttributeError: 'list' object has no attribute 'startswith'.
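To make the type mismatch concrete, here is a self-contained sketch (the tutorial-style function applied to a made-up one-row batch):

```python
# Sketch: datasets passes batched examples as a dict of lists,
# so the formatting function returns a *list* of strings, not a string.
def formatting_prompts_func(example):
    output_texts = []
    for i in range(len(example["instruction"])):
        text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]}"
        output_texts.append(text)
    return output_texts

batch = {"instruction": ["What is 2+2?"], "output": ["4"]}
texts = formatting_prompts_func(batch)
print(type(texts).__name__)  # list -- calling .startswith on this raises AttributeError
```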

After some digging, I found that the error stems from patch_sft_trainer_tokenizer in tokenizer_utils.py:826:

L826 test_text = dataset[0][dataset_text_field] if (formatting_func is None or not use_formatting_func) else formatting_func(dataset[0])
L829 test_text.startswith(tokenizer.bos_token)

If output_texts is changed to output_texts[0], the AttributeError is resolved, but another ValueError is raised during training:

in TRL's trainer at line 557:

if not isinstance(formatting_func(element), list):
    raise ValueError

So from what I gather, formatting_func(dataset[0]) is expected to be both a list and a string at the same time, which is obviously wrong.

My solution was to change formatting_func(dataset[0]) to formatting_func(dataset[0])[0], since formatting_func returns a list per the HF docs and the transformers trainer.
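A minimal sketch of how that change might look (the helper name `first_test_text` is made up for illustration; the real logic lives inline in patch_sft_trainer_tokenizer):

```python
# Hypothetical helper mirroring the snippet quoted above from
# tokenizer_utils.py; the function name is an assumption, not Unsloth's API.
def first_test_text(dataset, dataset_text_field, formatting_func, use_formatting_func):
    if formatting_func is None or not use_formatting_func:
        return dataset[0][dataset_text_field]
    # formatting_func returns a list of strings, so index into it
    # before calling string methods such as .startswith.
    return formatting_func(dataset[0])[0]
```

With this, the later `test_text.startswith(tokenizer.bos_token)` check receives a string in both branches.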

But even with this fixed, there are some other issues. I will probably submit a PR once those issues are taken care of as well.


Also, can I ask why TRL is not pinned to the latest version? Is there a reason for using SFTTrainer(args=TrainingArguments) instead of SFTConfig?

@Oseltamivir
Contributor Author

Update: got Llama-3 training on completions/chat/assistant working locally.

I have an ipynb working that trains Llama-3, specifically nvidiachat, on completions, similar to the HF CodeAlpaca-20k tutorial. I can upload it if desired.

However:

  1. I had to modify a portion of TRL's sft_trainer.py, as it could not identify nvidia's ASSISTANT: response key in the chat template.
  2. It uses a newer version of TRL and is not yet tested on Colab, but it is working locally as of now.

@danielhanchen
Contributor

Oh very cool! Thanks for this! Will check the PR!
