[TEMP FIX] Ollama / llama.cpp: cannot find tokenizer merges in model file #1065
Thank you for the miraculous Unsloth!! It was working very well last week. Now I am having the same problem as @thackmann. My notebook -> transformers 4.44.2 (the same as last week). Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Same issue!
Same issue.
Facing similar issues, is there a fix?? I'm blocked!
Same issue with Llama 3.2 3B, any solution please?
Hey guys, working on a fix. The new transformers version kind of broke everything.
Same issue... anyone have an idea where the problem is located?
Yes. I tried to work around it using llama.cpp, but it didn't work. The issue arises when we fine-tune and save the model.
Same issue. Huge bummer: literally spent hours fine-tuning and uploading to HF only to get these errors the past couple of days, thinking it was me.
Same issue here. Thank you @shimmyshimmer for working on the fix!
Hey guys. Yes, this is a current issue, but the team is working to fix it. If you saved the LoRA, you might not have to rerun training.
There is a workaround that was posted here, and it worked for me.
This will not work for Llama 3.2 models.
Same issue!!
Same issue.
Same issue here, any fix anyone? Here is the error I get after trying to run a fine-tuned model via Ollama: Error: llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in model file
I have the same issue with Llama 3.
Apologies guys - was out for a few days and it's been hectic, so sorry for the delay!! Will get to the bottom of it and hopefully can fix it today! Sorry and thank you all for your patience!
I can reproduce the error - in fact, all of llama.cpp, and thus Ollama etc., do not work with the newly serialized tokenizer files.
@danielhanchen
I just communicated with the Hugging Face team - they will upstream updates. I re-uploaded all the models - please try again and see if it works! This unfortunately means you need to re-finetune the model if you did not save the 16-bit merged HF weights or the LoRA weights - extreme apologies! If you saved them, simply update Unsloth, then reload them and convert to GGUF. Update Unsloth via:
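The actual upgrade command was lost when this comment was scraped, so the following is a hypothetical reconstruction: a standard pip force-upgrade that skips any cached wheel, which is the usual way to pull a freshly published Unsloth release.

```shell
# Hypothetical reconstruction (the original command was lost in extraction):
# force-reinstall Unsloth from PyPI, bypassing the local wheel cache so the
# newly published fix is actually downloaded.
pip install --upgrade --no-cache-dir unsloth
```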
I will update everyone once the Hugging Face team resolves the issue! Sorry again! Pinging everyone (and apologies again for the issues and inconvenience!!): @xmaayy @avvRobertoAlma @thackmann @kingabzpro @williamzebrowskI @FotieMConstant @laoc81 @gianmarcoalessio @ThaisBarrosAlvim @Franky-W @Saber120 @adampetr @David33706 @Mukunda-Gogoi
Thanks @danielhanchen, and sorry for the disturbances. To give some context on what is happening here: we updated the format of merges serialization in tokenizers.

The change was made to be backwards-compatible: new versions of tokenizers can still load files saved in the old format. However, it could not be forwards-compatible: if a file is serialized with the new format, older versions of tokenizers cannot load it.

This is why we're seeing this issue: new files are serialized using the new version, and these files are not loadable in llama.cpp yet. We're updating all other codepaths (namely llama.cpp) to adapt to the new version. Once that is shipped, all your trained checkpoints will be directly loadable as usual. We're working with llama.cpp to ship this as fast as possible. Thank you!

Issue tracker in llama.cpp: ggerganov/llama.cpp#9692
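To make the compatibility break concrete, here is a minimal sketch of the two serializations. It assumes the shape described above: older tokenizer.json files store each BPE merge as a single space-joined string, newer ones store it as a pair; the toy fragments and the helper function below are illustrative, not taken from the tokenizers source.

```python
import json

# Toy tokenizer.json fragments: legacy merges are space-joined strings,
# newer merges are two-element lists. (Illustrative layout, per the
# description above.)
legacy = json.loads('{"model": {"merges": ["h e", "l l", "he ll"]}}')
modern = json.loads('{"model": {"merges": [["h", "e"], ["l", "l"], ["he", "ll"]]}}')

def read_merges(tokenizer_json):
    """Accept both serializations and return (left, right) tuples.

    A reader that assumes every entry is a string (as older loaders did)
    crashes on the pair form -- the forwards-compatibility break this
    thread is about.
    """
    pairs = []
    for entry in tokenizer_json["model"]["merges"]:
        if isinstance(entry, str):      # legacy: "left right"
            left, right = entry.split(" ", 1)
        else:                           # new: ["left", "right"]
            left, right = entry
        pairs.append((left, right))
    return pairs

assert read_merges(legacy) == read_merges(modern)
```

A tolerant reader like this is backwards-compatible by construction; the breakage came from old readers that only handled the string form.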
Sorry for the poor wording! Yep, if anyone has already saved the LoRA or 16-bit weights (before converting to GGUF or Ollama), you can reload them in Unsloth after updating Unsloth, then save again, as a temporary solution.
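For anyone applying this workaround, a minimal sketch of the reload-and-reconvert step, using Unsloth's documented FastLanguageModel.from_pretrained and save_pretrained_gguf calls. The directory names, sequence length, and quantization method are hypothetical placeholders, not values from this thread; this is a sketch, not a definitive recipe.

```python
# Sketch of the temporary workaround: after upgrading Unsloth, reload the
# saved LoRA adapter and re-export to GGUF so the tokenizer is re-serialized
# in a format llama.cpp / Ollama can read.
# "lora_model" and "model_gguf" are hypothetical placeholder paths.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",   # directory where the LoRA adapter was saved
    max_seq_length=2048,
    load_in_4bit=True,
)

# Re-save as GGUF; the updated Unsloth writes merges in a compatible format.
model.save_pretrained_gguf("model_gguf", tokenizer, quantization_method="q4_k_m")
```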
Thank you for the update! I followed the steps you provided, and I'm happy to report that it worked perfectly on my end. I updated Unsloth, reloaded the saved weights, and successfully converted them to GGUF. Everything is running smoothly now with the transformers==4.44.2 fix. I appreciate the quick re-upload and the detailed instructions. I'll keep an eye out for the official update from Hugging Face, but for now, everything seems to be working great. Thanks again for your efforts! Best regards,
Thank you @danielhanchen for the quick fix. The original notebook is now working.
The fix is not working on Kaggle.
I get this error when I run the Colab after applying the changes; seems to be an issue.
@kingabzpro I just updated PyPI, so updating Unsloth should now pick up the fix.
It is working on Kaggle now. Thank you.
I'm sorry, but it's still an error in Version: 2024.12.4.
numpy version is
I have saved the adapter model and converted it to GGUF; when I run it in Ollama, it's still the same error: Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
@lastrei Apologies - do you know which model exactly?
Thanks @danielhanchen,
I am getting a similar issue with the model. Torch: 2.5.1, CUDA toolkit 12.1.
I encountered the same problem a few hours ago.
Then, Ollama can run the model without problems. FYR, my env:
Name: numpy
Name: transformers
Name: unsloth
Thanks @JohnWangCH, I will try it... it works!
Thank you for developing this useful resource. The Ollama notebook reports:
{"error":"llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in model file"}
This is the notebook with the error. It is a copy of the original notebook.
This seems similar to the issue reported in #1062.