
talk-llama : add check for deepseek-r1-qwen in llama-vocab.cpp #2769

Open · wants to merge 1 commit into master
Conversation

kristianmk

talk-llama: Add a check for the deepseek-r1-qwen pre-tokenizer in llama-vocab.cpp, so that models like unsloth/DeepSeek-R1-Distill-Qwen-32B from HuggingFace can be run. A full sync with llama.cpp would be better, if that could be automated somehow.

Solves the following unknown pre-tokenizer error when running with DeepSeek-R1-Distill-Qwen-32B:
```
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_model_load_from_file: failed to load model
No llama.cpp model specified. Please provide using -ml <modelfile>
```

@foldl
Collaborator

foldl commented Feb 5, 2025

@ggerganov Could talk-llama be moved into llama.cpp? Sync whisper.cpp into llama.cpp looks simpler and less frequent.

@ggerganov
Owner

> @ggerganov Could talk-llama be moved into llama.cpp? Sync whisper.cpp into llama.cpp looks simpler and less frequent.

It would simplify the sync, yes, but we would need to introduce SDL2 support to the llama.cpp examples, and currently it would be used only for this single example, while in whisper.cpp more examples use SDL2. So I am not very confident that it would be worth it.
