Steps to support the Dolly model #1308
Conversation
```python
        yield from self.added_tokens()

    def __repr__(self) -> str:
        return f"<SentencePieceVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"
```

Suggested change:

```diff
-        return f"<SentencePieceVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"
+        return f"<PretrainedVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"
```
For a GPT-NeoX implementation using [link]. But before doing that, we need to add a [link].
Nice. Sorry for ghost posting, but why does llama.cpp exist if you have the ggml repo? I am a beginner in terms of AI.
Hi @ggerganov,
I created this one. I can open a PR if that is OK: https://github.com/mverrilli/ggml/tree/dolly-v2/examples/dolly-v2
Yes, please open a PR.
It is pretty comparable. Q5_0 is significantly faster for the larger model. I was not getting good results until I added special-token handling, since the tokenizer would otherwise split a special token in two. I posted some sample runs in the README.
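The special-token problem described above can be sketched like this: before handing text to the base tokenizer, scan for exact occurrences of the added special tokens so each one maps to a single id instead of being split into pieces. This is a minimal illustration of the idea, not the actual dolly-v2 example code; the names `special_tokens` and `base_tokenize` are hypothetical.

```python
def tokenize_with_specials(text, special_tokens, base_tokenize):
    """Tokenize `text`, emitting each special token as a single id.

    special_tokens: dict mapping special-token strings to their ids
    base_tokenize:  fallback tokenizer for ordinary text spans
    """
    # Try longer specials first so overlapping prefixes resolve correctly.
    ordered = sorted(special_tokens, key=len, reverse=True)
    ids, start = [], 0
    while start < len(text):
        # Find the earliest occurrence of any special token from `start`.
        best = None
        for tok in ordered:
            pos = text.find(tok, start)
            if pos != -1 and (best is None or pos < best[0]):
                best = (pos, tok)
        if best is None:
            # No special token left; tokenize the remainder normally.
            ids.extend(base_tokenize(text[start:]))
            break
        pos, tok = best
        if pos > start:
            ids.extend(base_tokenize(text[start:pos]))
        ids.append(special_tokens[tok])
        start = pos + len(tok)
    return ids
```

With this in place, a prompt like `"### Instruction:\n..."` produces one id for the special marker rather than several fragments, which is what made the outputs usable.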
GGML is a general-purpose tensor/matrix library that does not include support for running specific models directly (I believe). This repo exists to use GGML to implement the specific architecture of LLaMA.
The ggml dolly example results look good. |
Hi team, any update?
wrong repository |
is ggml the right one? |
What
Why
I want to use the Dolly model with llama.cpp. The checkpoint relies on some of torch's ByteStorage machinery.
Remaining issues
The vocab file is in a completely different format from sentencepiece (it uses a pretrained tokenizer): either it must somehow get converted, or another Vocab class must be added to convert.py.
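One way such a Vocab class could look, assuming a Hugging Face fast-tokenizer `tokenizer.json` file (whose `model.vocab` maps token strings to ids and whose top-level `added_tokens` lists the extra tokens): the sketch below mirrors the interface implied by the `__repr__` suggestion earlier in this thread (`vocab_size_base`, `added_tokens_list`), but it is an illustration, not an actual patch to convert.py; the scores are dummies.

```python
import json


class PretrainedVocab:
    """Sketch of a Vocab for HF `tokenizer.json` files (not sentencepiece)."""

    def __init__(self, fname_tokenizer: str):
        with open(fname_tokenizer, encoding="utf-8") as f:
            data = json.load(f)
        # `model.vocab` maps token string -> id in HF fast-tokenizer files.
        vocab = data["model"]["vocab"]
        # Recover id -> token ordering for the base vocabulary.
        self.tokens = [tok for tok, _ in sorted(vocab.items(), key=lambda kv: kv[1])]
        self.vocab_size_base = len(self.tokens)
        # Added tokens that are not already part of the base vocab.
        self.added_tokens_list = [
            t["content"] for t in data.get("added_tokens", [])
            if t["content"] not in vocab
        ]

    def all_tokens(self):
        # Yield (token bytes, score) pairs: base tokens first, then added ones.
        # A pretrained BPE tokenizer has no sentencepiece scores, so use 0.0.
        for tok in self.tokens:
            yield tok.encode("utf-8"), 0.0
        for tok in self.added_tokens_list:
            yield tok.encode("utf-8"), 0.0

    def __repr__(self) -> str:
        return (f"<PretrainedVocab with {self.vocab_size_base} base tokens "
                f"and {len(self.added_tokens_list)} added tokens>")
```

The open question would then be how llama.cpp's loader should consume such a vocab at runtime, since its tokenizer logic assumes sentencepiece-style scores.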