
[TEMP FIX] Ollama / llama.cpp: cannot find tokenizer merges in model file #1065

Open
thackmann opened this issue Sep 27, 2024 · 36 comments
Labels
fixed - pending confirmation (Fixed, waiting for confirmation from poster) · URGENT BUG (Urgent bug)

Comments

@thackmann

Thank you for developing this useful resource. The Ollama notebook reports

{"error":"llama runner process has terminated: error loading modelvocabulary: cannot find tokenizer merges in model file"}

This is the notebook with the error. It is a copy of the original notebook.

This seems similar to the issue reported in #1062.

@laoc81

laoc81 commented Sep 27, 2024

Thank you for the miraculous "unsloth"!! It was working very well last week.

Now I am having the same problem as @thackmann:

My notebook -> transformers 4.44.2 (the same as last week).

Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file

@xmaayy

xmaayy commented Sep 27, 2024

Same issue!

@ThaisBarrosAlvim

ThaisBarrosAlvim commented Sep 28, 2024

Same issue!

@kingabzpro

same issue.

@Mukunda-Gogoi

Facing similar issues. Is there a fix?? I'm blocked!

@Saber120

Saber120 commented Sep 28, 2024

Same issue with Llama 3.2 3B, any solution please?

@shimmyshimmer
Collaborator

Hey guys, working on a fix. The new transformers version kind of broke everything.

@adampetr

Same issue... anyone have an idea where the problem is located?

@kingabzpro

Same issue with Llama 3.2 3B, any solution please?

Yes. I tried a workaround using llama.cpp, but it didn't work. The issue arises when we fine-tune and save the model.

@williamzebrowskI

williamzebrowskI commented Sep 28, 2024

Same issue. Huge bummer: I literally spent hours fine-tuning and uploading to HF, only to get these errors the past couple of days thinking it was me.

@Franky-W

same issue here.

thank you @shimmyshimmer for working on the fix!

@mahiatlinux
Contributor

Hey guys. Yes, this is a current issue, but the team is working to fix it. If you saved the LoRA adapter, you might not have to rerun training.
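
For anyone mid-run, a minimal sketch of saving the adapter so a failed GGUF export does not cost you the training run (the "lora_model" directory name is illustrative, and model/tokenizer are the objects from your Unsloth training notebook):

# Save only the LoRA adapter + tokenizer; these can be reloaded
# and re-exported to GGUF once the fix lands.
model.save_pretrained("lora_model")
tokenizer.save_pretrained("lora_model")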

@williamzebrowskI

There is a workaround that was posted here and it worked for me.

#1062 (comment)

@kingabzpro

There is a workaround that was posted here and it worked for me.

#1062 (comment)

This will not work for Llama 3.2 models.

@gianmarcoalessio

same issue!!

@David33706

same issue

@shimmyshimmer added the currently fixing (Am fixing now!), URGENT BUG (Urgent bug), and help wanted (Help from the OSS community wanted!) labels Sep 29, 2024
@FotieMConstant

FotieMConstant commented Sep 29, 2024

Same issue here, any fix anyone?

Here is the error I get after trying to run a fine-tuned model via Ollama:

Error: llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in model file

@avvRobertoAlma

I have the same issue with Llama 3.
llama.cpp error: 'error loading model vocabulary: cannot find tokenizer merges in model file'

@danielhanchen
Contributor

Apologies guys - was out for a few days and it's been hectic, so sorry for the delay!! Will get to the bottom of this and hopefully fix it today! Sorry, and thank you all for your patience!

@danielhanchen
Contributor

I can reproduce the error. In fact, all of llama.cpp (and thus Ollama etc.) fails with transformers>=4.45.1. I'll update everyone on a fix; it looks like Hugging Face's update most likely broke something in tokenizer exports.

@danielhanchen changed the title from "Ollama: cannot find tokenizer merges in model file" to "Ollama / llama.cpp: cannot find tokenizer merges in model file" Sep 30, 2024
@danielhanchen removed the help wanted (Help from the OSS community wanted!) label Sep 30, 2024
@drsanta-1337

drsanta-1337 commented Sep 30, 2024

@danielhanchen
check this comment out, see if it helps.

huggingface/tokenizers#1553 (comment)

@danielhanchen
Contributor

danielhanchen commented Sep 30, 2024

I just communicated with the Hugging Face team - they will upstream updates to llama.cpp later in the week. It seems like tokenizers>=0.20.0 is the culprit.

I re-uploaded all Llama-3.2 models and as a temporary fix, Unsloth will use transformers==4.44.2.

Please try again and see if it works! This unfortunately means you need to re-finetune the model if you did not save the 16-bit merged HF weights or the LoRA weights - extreme apologies! If you saved them, simply update Unsloth, then reload them and convert to GGUF.

Update Unsloth via:

pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

I will update everyone once the Hugging Face team resolves the issue! Sorry again!
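
For reference, a quick sanity check that the pinned versions took effect (a minimal sketch; run it in the same environment after reinstalling):

# The temporary fix pins transformers==4.44.2, which in turn pulls in
# tokenizers<0.20.0 - the versions that still write the old merges format.
import transformers, tokenizers
print(transformers.__version__)  # expect 4.44.2
print(tokenizers.__version__)    # expect <0.20.0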

Pinging everyone (and apologies for the issues and inconvenience again!!) @xmaayy @avvRobertoAlma @thackmann @kingabzpro @williamzebrowskI @FotieMConstant @laoc81 @gianmarcoalessio @ThaisBarrosAlvim @Franky-W @Saber120 @adampetr @David33706 @Mukunda-Gogoi

@danielhanchen added the fixed - pending confirmation (Fixed, waiting for confirmation from poster) label and removed the currently fixing (Am fixing now!) label Sep 30, 2024
@danielhanchen changed the title from "Ollama / llama.cpp: cannot find tokenizer merges in model file" to "[TEMP FIX] Ollama / llama.cpp: cannot find tokenizer merges in model file" Sep 30, 2024
@danielhanchen pinned this issue Sep 30, 2024
@LysandreJik

LysandreJik commented Sep 30, 2024

Thanks @danielhanchen, and sorry for the disruption; to give some context as to what is happening here, we updated the format of merges serialization in tokenizers to be much more flexible (this was done in this commit).

The change was done to be backwards-compatible: tokenizers and all libraries that depend on it will keep the ability to load merge files which were serialized in the old way.

However, it could not be forwards-compatible: if a file is serialized with the new format, older versions of tokenizers will not be able to load it.

This is why we're seeing this issue: new files are serialized using the new version, and these files are not loadable in llama.cpp, yet. We're updating all other codepaths (namely llama.cpp) to adapt to the new version. Once that is shipped, all your trained checkpoints will be directly loadable as usual. We're working with llama.cpp to ship this as fast as possible.

Thank you!

Issue tracker in llama.cpp: ggerganov/llama.cpp#9692
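
To make the change concrete, here is an abbreviated, illustrative sketch of the "merges" field inside tokenizer.json (not the full file; the token strings are made up):

# Old serialization: each merge is a single space-joined string.
old_merges = ["Ġ t", "h e", "i n"]

# New serialization: each merge is a [left, right] pair, which also allows
# merging tokens that themselves contain spaces.
new_merges = [["Ġ", "t"], ["h", "e"], ["i", "n"]]

# Old readers (including llama.cpp at the time) only understand the first form,
# which is why GGUF exports written with the new format fail to load.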

@danielhanchen
Contributor

Sorry for the poor wording! Yep, so if anyone has already saved the LoRA or 16-bit weights (before converting to GGUF or Ollama), you can reload them in Unsloth after updating Unsloth and save again, as a temporary solution as well.
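
A minimal sketch of that reload-and-re-export path (the paths, max_seq_length, and quantization method are illustrative; adapt them to your own run):

from unsloth import FastLanguageModel

# Reload the previously saved LoRA adapter (or 16-bit merged weights).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",  # directory saved before the GGUF step
    max_seq_length=2048,
    load_in_4bit=True,
)

# Re-export to GGUF with the fixed tokenizer serialization.
model.save_pretrained_gguf("model", tokenizer, quantization_method="q8_0")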

@Saber120

I just communicated with the Hugging Face team - they will upstream updates to llama.cpp later in the week. It seems like tokenizers>=0.20.0 is the culprit. [...]

Thank you for the update! I followed the steps you provided, and I’m happy to report that it worked perfectly on my end. I updated Unsloth, reloaded the saved weights, and successfully converted them to GGUF. Everything is running smoothly now with the transformers==4.44.2 fix.

I appreciate the quick re-upload and the detailed instructions. I’ll keep an eye out for the official update from Hugging Face, but for now, everything seems to be working great.

Thanks again for your efforts!

Best regards,

@thackmann
Author

Thank you @danielhanchen for the quick fix. The original notebook is now working.

@kingabzpro

The fix is not working on Kaggle.

@FotieMConstant

I just communicated with the Hugging Face team - they will upstream updates to llama.cpp later in the week. It seems like tokenizers>=0.20.0 is the culprit. [...]

I get this error when I run the Colab after applying the changes; it seems to still be an issue. [screenshot of the error]

@danielhanchen
Contributor

@kingabzpro I just updated PyPI, so pip install unsloth should have the temporary fixes - you might have to restart Kaggle.

@kingabzpro

@kingabzpro I just updated PyPI, so pip install unsloth should have the temporary fixes - you might have to restart Kaggle.

It is working on Kaggle now. Thank you.

@danielhanchen unpinned this issue Oct 19, 2024
@lastrei

lastrei commented Dec 12, 2024

I'm sorry, but it still errors in Version: 2024.12.4.
I installed unsloth with pip install "unsloth[cu121-torch240] @ git+https://github.com/unslothai/unsloth.git"
The transformers version is

Name: transformers
Version: 4.46.3

The numpy version is

Name: numpy
Version: 1.26.4

I have saved the adapter model and converted it to GGUF; when I run it in Ollama, it's still the same error:

Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file

@danielhanchen
Contributor

@lastrei Apologies - do you know which model exactly?

@lastrei

lastrei commented Dec 13, 2024

@lastrei Apologies - do you know which model exactly?

Thanks @danielhanchen,
the model is unsloth/meta-llama-3.1-8b-instruct-bnb-4bit.
BTW:
I used llama.cpp to manually convert the adapter to GGUF, which can be used in Ollama with the base model. This transformers upgrade brought a lot of trouble, and there are also corresponding llama.cpp problems: #748
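
For reference, one way to do that manual conversion (a sketch only; the paths are illustrative and flags can differ between llama.cpp checkouts, so check convert_lora_to_gguf.py --help for your version):

python llama.cpp/convert_lora_to_gguf.py lora_model --base base_model_dir --outfile adapter.gguf

The resulting adapter.gguf can then be paired with the base model in an Ollama Modelfile via its FROM and ADAPTER directives.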

@ethanelasky

ethanelasky commented Dec 13, 2024

I am getting a similar issue with the model meta-llama/Llama-3.1-8B-Instruct and the same numpy and transformers versions as @lastrei.

Torch: 2.5.1, CUDA toolkit 12.1

@JohnWangCH

I encountered the same problem a few hours ago.
The problem has been solved with the following steps on my end:

  • Rebuild llama.cpp with the following commands:
cd llama.cpp
git checkout a6744e43e80f4be6398fc7733a01642c846dce1d
git submodule update --init --recursive
make clean
make all -j
  • Call model.save_pretrained_gguf again:
if True: model.save_pretrained_gguf("model", tokenizer)

Then, Ollama can run the model without problems. FYR.

My env:
model_name = "unsloth/Llama-3.2-3B-Instruct",

Name: numpy
Version: 1.26.4

Name: transformers
Version: 4.46.3

Name: unsloth
Version: 2024.12.4

@lastrei

lastrei commented Dec 14, 2024

I encountered the same problem a few hours ago. The problem has been solved with the following steps on my end: [...]

Thanks @JohnWangCH, I will try it.

It works!
