
can't load tokenizer #603

Open
Guo-Chenxu opened this issue Nov 11, 2023 · 2 comments

Comments

@Guo-Chenxu

I ran the code with the following command:

python finetune.py \
    --base_model='/home/guochenxu/pythonProjects/alpaca-lora/alpaca-lora-7b' \
    --num_epochs=10 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./lora-alpaca-512-qkvo' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' \
    --lora_r=16 \
    --micro_batch_size=8

I downloaded the model files from https://huggingface.co/tloen/alpaca-lora-7b, and my directory looks like this:

(screenshot: directory listing of the downloaded model files)

But I get the following error:

(screenshot: tokenizer loading error)

It seems I don't have the tokenizer files. How can I get them, or is there another way to solve this problem?
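As a quick sanity check (a minimal sketch, not from this thread; the file list is an assumption and exact names vary by model and transformers version), you can verify whether a local model directory actually contains tokenizer files before pointing `--base_model` at it:

```python
from pathlib import Path

# Files transformers typically expects for a LLaMA-style tokenizer.
# NOTE: this list is an assumption; some models ship tokenizer.json instead.
EXPECTED_TOKENIZER_FILES = [
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]

def missing_tokenizer_files(model_dir):
    """Return the expected tokenizer files that are absent from model_dir."""
    d = Path(model_dir)
    return [name for name in EXPECTED_TOKENIZER_FILES if not (d / name).exists()]
```

If the returned list is non-empty, the directory is likely a LoRA adapter repo (adapter weights only) rather than a full base model with its tokenizer.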

I'm a beginner, so this question may seem a little stupid, but I have tried searching the web and the problem still persists. I would appreciate it if anyone could answer me.

@hychaochao

I don't know whether you've solved this yet. I'm also fairly new to this, but from your code it looks like you want to fine-tune further on top of alpaca-lora? You can do it like this:

python finetune_copy.py \
    --base_model 'llama1' \
    --data_path 'XXX.json' \
    --output_dir './lora-alpaca' \
    --resume_from_checkpoint 'tloen/alpaca-lora'

@Guo-Chenxu
Author


Thank you for your answer. My problem was that I had downloaded the wrong model; I should have used alpaca-7b. (Admittedly a pretty stupid mistake 😂)
