Steps to support the Dolly model #1308

Closed
wants to merge 3 commits

Conversation


@devkral devkral commented May 3, 2023

What

  • The current glob for model files is very restricted; I relaxed it a little so it can find the Dolly model files.
  • Support for torch's ByteStorage was added. As far as I can see it is uint8; see the sketch below.
  • New: a PretrainedVocab class is added.
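Roughly, the two converter changes amount to something like the following sketch (the names are illustrative, not the exact convert.py internals):

from pathlib import Path

import numpy as np

# Mapping from torch storage class names to numpy dtypes; the new entry is
# ByteStorage, which is treated as plain uint8 bytes.
STORAGE_TYPE_TO_DTYPE = {
    "FloatStorage": np.float32,
    "HalfStorage": np.float16,
    "ByteStorage": np.uint8,
}

def find_model_files(model_dir: Path) -> list[Path]:
    # Relaxed glob: accept any *.bin shard instead of one fixed file name,
    # so checkpoints laid out like Dolly's are also picked up.
    return sorted(model_dir.glob("*.bin"))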

Why

I want to use the Dolly model with llama.cpp. Its checkpoint uses torch's ByteStorage.

Remaining issues

The vocab file is in a completely different format from SentencePiece (Dolly uses a pretrained tokenizer):

Somehow it has to be converted, or another Vocab class has to be added to convert.py; a sketch of what that could look like follows after this list.

  • Dolly uses the gpt_neox architecture, which is different from what llama.cpp understands, so it needs conversion.
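A rough sketch of what such a PretrainedVocab could look like, assuming a Hugging Face-style tokenizer.json (the field names come from the tokenizers file format; this is not the exact code in this PR):

import json
from pathlib import Path
from typing import Iterable, Tuple

class PretrainedVocab:
    def __init__(self, fname_tokenizer: Path) -> None:
        data = json.loads(fname_tokenizer.read_text(encoding="utf-8"))
        # BPE vocab: token string -> id
        self.vocab = data["model"]["vocab"]
        self.added_tokens_list = [t["content"] for t in data.get("added_tokens", [])]
        self.vocab_size_base = len(self.vocab)

    def all_tokens(self) -> Iterable[Tuple[bytes, float]]:
        # A pretrained BPE tokenizer carries no SentencePiece scores, so emit a
        # placeholder score per token: base tokens first, added tokens last.
        for token, _id in sorted(self.vocab.items(), key=lambda kv: kv[1]):
            yield token.encode("utf-8"), 0.0
        for token in self.added_tokens_list:
            yield token.encode("utf-8"), -1000.0

    def __repr__(self) -> str:
        return f"<PretrainedVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"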

        yield from self.added_tokens()

    def __repr__(self) -> str:
        return f"<SentencePieceVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"
Collaborator


Suggested change
return f"<SentencePieceVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"
return f"<PretrainedVocab with {self.vocab_size_base} base tokens and {len(self.added_tokens_list)} added tokens>"

@ggerganov
Member

For a GPT-NeoX implementation using ggml, see the StableLM example.
It will require some extra work to integrate this into llama.cpp.

But before doing that, we need to add a ggml example for Dolly and make sure that it works correctly.

@ggerganov ggerganov closed this May 4, 2023
@devkral
Author

devkral commented May 5, 2023

Nice. Sorry for ghost posting, but why does llama.cpp exist if you have the ggml repo?

I am a beginner when it comes to AI.

@mverrilli

Hi @ggerganov

But before doing that, we need to add a ggml example for Dolly and make sure that it works correctly.

I created this one; I can open a PR if that is OK.

https://github.com/mverrilli/ggml/tree/dolly-v2/examples/dolly-v2

@ggerganov
Member

@mverrilli

Yes, please open a PR.
In your experience, do the ggml results look OK when you compare to the reference Python implementation?
I'll probably do some more rigorous testing later, but would like to get an additional opinion.

@mverrilli

@ggerganov ggml-org/ggml#132

It is pretty comparable. The Q5_0 quantization is significantly faster for the larger model. I was not getting good results until I added special token handling, since the tokenizer would otherwise split a special token into two. I posted some sample runs in the README.
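For anyone curious, the idea behind the special token handling is roughly the following (a hedged Python sketch with made-up names; the actual Dolly example is C++ in the ggml repo): scan the prompt for registered special tokens first and only run the ordinary tokenizer on the plain text in between, so an added marker such as an instruction prefix is emitted as its single id instead of being split in two.

from typing import Callable, Dict, List

def tokenize_with_specials(
    text: str,
    special_tokens: Dict[str, int],        # special token string -> id
    fallback: Callable[[str], List[int]],  # ordinary (BPE) tokenizer for plain text
) -> List[int]:
    ids: List[int] = []
    pos = 0
    while pos < len(text):
        # Find the earliest occurrence of any special token; prefer the
        # longest match when several start at the same position.
        hits = [(text.find(tok, pos), tok) for tok in special_tokens]
        hits = [(i, tok) for i, tok in hits if i != -1]
        if not hits:
            ids.extend(fallback(text[pos:]))
            break
        i, tok = min(hits, key=lambda h: (h[0], -len(h[1])))
        if i > pos:
            ids.extend(fallback(text[pos:i]))
        ids.append(special_tokens[tok])
        pos = i + len(tok)
    return ids

This way the fallback tokenizer never sees the special markers, which matches the behaviour described above.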

@j-f1
Collaborator

j-f1 commented May 5, 2023

Nice. Sorry for ghost posting, but why does llama.cpp exist if you have the ggml repo?

I am a beginner when it comes to AI.

GGML is a general-purpose matrix API that doesn't include support for running specific models directly (I believe). This repo exists to use GGML to implement the specific structures of LLaMA.

@devkral
Author

devkral commented May 11, 2023

The ggml dolly example results look good.

@xingchensong
Contributor

Hi team, any update?

@devkral
Author

devkral commented May 25, 2023

wrong repository

@xingchensong
Contributor

wrong repository

is ggml the right one?
