Bug: cannot find tokenizer merges in model file #9692
Comments
Same problem.
The relevant code is in gguf-py/gguf/vocab.py, in the _try_load_from_tokenizer_json function.
Can this be made compatible?
Hey hey, I'm VB from the open source team at Hugging Face. I can confirm that this is due to an update we've made to tokenizers: we now persist merges as a list of pairs instead of strings. Everything should work on recent transformers; for reference, this comes from a recent change in tokenizers.
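For concreteness, a small sketch of the two shapes (the merge entries below are illustrative, not taken from any particular model):

```python
import json

# Old-style tokenizer.json: each merge is a single space-joined string.
old_style = {"model": {"type": "BPE", "merges": ["Ġ t", "h e", "i n"]}}

# New-style tokenizer.json (recent tokenizers releases): each merge is a pair.
new_style = {"model": {"type": "BPE", "merges": [["Ġ", "t"], ["h", "e"], ["i", "n"]]}}

for name, data in (("old", old_style), ("new", new_style)):
    first = data["model"]["merges"][0]
    print(name, type(first).__name__, json.dumps(first, ensure_ascii=False))
```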
There are some temporary fixes which downgrade transformers to an earlier version.
Tagging @compilade for any insights on how best to resolve this.
A couple of repos for testing:
The difference is the way merges are serialized in the tokenizer.json file.
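As a rough way to check which serialization a given repo uses (assuming its tokenizer.json has been downloaded locally; the filename is just an assumption):

```python
import json

# Hypothetical path to a downloaded tokenizer.json.
with open("tokenizer.json", encoding="utf-8") as f:
    tokenizer = json.load(f)

merges = tokenizer.get("model", {}).get("merges", [])
if merges and isinstance(merges[0], str):
    print("old format, e.g.", merges[0])   # e.g. "Ġ t"
elif merges:
    print("new format, e.g.", merges[0])   # e.g. ["Ġ", "t"]
else:
    print("no merges found")
```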
@pcuenca Thanks, I confirm that updating transformers fixes the conversion. I wonder, should we try to find a way to make the converter work with both the old and the new merges format?
In my opinion, I think upgrading transformers is the way to go.
Opened a PR to update the transformers version in the short term: #9694. We tested it with both the new format and the old format.
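As an aside, one hedged way to sanity-check a converted file is to look for the standard tokenizer.ggml.merges metadata key with gguf-py's GGUFReader (the model path below is hypothetical):

```python
from gguf import GGUFReader  # gguf-py, shipped in the llama.cpp repo

# Hypothetical path to a converted model file.
reader = GGUFReader("model.gguf")

# BPE merges are stored under this standard GGUF metadata key; if it is
# missing, llama.cpp/ollama report "cannot find tokenizer merges in model file".
print("merges present:", "tokenizer.ggml.merges" in reader.fields)
```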
On upgrading: the merges are currently read in llama.cpp/gguf-py/gguf/vocab.py, lines 123 to 126 at 8277a81.
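Paraphrasing those lines from memory rather than quoting them, the check only keeps merges when the entries are plain strings, which is why the new list-of-pairs serialization is silently dropped; a small illustration:

```python
def load_merges_old(tokenizer: dict):
    # Paraphrase of the pre-fix gguf-py logic (not the exact code):
    # merges are only accepted when the first entry is a plain string,
    # so list-of-pair merges from newer tokenizers are ignored and the
    # resulting GGUF ends up without tokenizer.ggml.merges.
    merges = tokenizer.get("model", {}).get("merges")
    if isinstance(merges, list) and merges and isinstance(merges[0], str):
        return merges
    return None

# New-format input: the old check returns None, reproducing the symptom.
print(load_merges_old({"model": {"merges": [["Ġ", "t"], ["h", "e"]]}}))
```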
To support the new format with older versions of transformers, gguf-py would also need to accept merges serialized as lists of pairs, so a change here may still be worthwhile.
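A minimal sketch of that idea (my own illustration, not the actual patch): normalize pair-style merges back into the space-joined strings that GGUF expects.

```python
def normalize_merges(merges: list) -> list:
    # Accept both serializations: "Ġ t" (old) and ["Ġ", "t"] (new),
    # converting everything to the space-joined string form used by
    # the tokenizer.ggml.merges field.
    out = []
    for merge in merges:
        out.append(merge if isinstance(merge, str) else " ".join(merge))
    return out

print(normalize_merges(["Ġ t", ["h", "e"], ["i", "n"]]))
# -> ['Ġ t', 'h e', 'i n']
```

One caveat: a plain space join is ambiguous for merge parts that themselves contain spaces, so a real fix has to handle that case more carefully.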
This should be resolved now. @nd791899, please close the issue if that's the case for you.
What happened?
When I use transformers==4.45.1 and convert a model with llama.cpp to the file format used by ollama, there is no error, but when I load the model with ollama, the error `cannot find tokenizer merges in model file` appears.
Name and Version
All versions
What operating system are you seeing the problem on?
No response
Relevant log output
No response