Merge exllama backend into united. #447
Merged
Conversation
Merge united
Merge united.
Strip the eos token from exllama generations.
The end-of-sequence (</s>) token indicates the end of a generation. When a token sequence containing </s> is decoded, an extra (wrong) space is inserted at the beginning of the generation. To avoid this, strip the eos token out of the result before returning it. The eos token was getting stripped later anyway, so this doesn't change the output except to avoid the spurious leading space.
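A minimal sketch of the idea, assuming a Hugging Face style tokenizer with an `eos_token_id` attribute (the helper name is illustrative, not the PR's actual code): dropping the eos id before decoding avoids the spurious leading space that appears when </s> is decoded and then stripped as text.

```python
def decode_without_eos(tokenizer, token_ids):
    # Remove the eos token id before decoding instead of stripping "</s>"
    # from the decoded string, which would leave a leading space behind.
    eos_id = tokenizer.eos_token_id
    return tokenizer.decode([t for t in token_ids if t != eos_id])
```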
Add stopper hooks support to exllama
Resample to work around a bug in torch.multinomial
There is a bug in PyTorch 2.0.1 that allows torch.multinomial to sometimes choose elements that have zero probability. Since this is uncommon, we can continue to use torch.multinomial as long as we verify that the results are valid. If they aren't, try again until the probability of each selected token is positive.
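The workaround can be sketched as a small retry loop (the function name and structure are illustrative assumptions, not the exact code from the PR):

```python
import torch

def safe_multinomial(probs: torch.Tensor, num_samples: int) -> torch.Tensor:
    # torch.multinomial in PyTorch 2.0.1 can occasionally return indices
    # whose probability is zero; resample until every sampled index has
    # positive probability. Zero-probability picks are rare, so the loop
    # almost always exits on the first iteration.
    while True:
        samples = torch.multinomial(probs, num_samples)
        if bool(torch.all(probs.gather(-1, samples) > 0)):
            return samples
```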
Add the eos token to exllama bad words.
The bos token was already hardcoded as a bad word id. Store badwords in a list and iterate over them during generation. Add the Llama eos token to the list of bad words. Also support "single line mode", which adds the newline token (13) to badwords.
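A sketch of that badwords handling, assuming Llama-family token ids (bos = 1, eos = 2, newline = 13, as the description above states); the helper names are hypothetical:

```python
import torch

def build_badwords(single_line: bool) -> list[int]:
    badwords = [1, 2]        # bos (1) was already banned; add eos (2)
    if single_line:
        badwords.append(13)  # "single line mode" also bans the newline token
    return badwords

def apply_badwords(logits: torch.Tensor, badwords: list[int]) -> torch.Tensor:
    # Ban each id by forcing its logit to -inf before sampling,
    # so it can never be selected.
    for token_id in badwords:
        logits[..., token_id] = -float("inf")
    return logits
```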
Modify exllama to load unrenamed gptq quantized models
Read config.json and enable exllama loading if the model has a `quantization_config` with a `quant_method` of `gptq`. Note that this implementation is limited and only supports model.safetensors. That said, it supports loading popular gptq quantized models without renaming or symlinking the model file.
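The detection logic might look roughly like this (a sketch under the assumptions in the commit message; the function name is illustrative):

```python
import json
from pathlib import Path

def is_gptq_model(model_dir: str) -> bool:
    # Enable the exllama loader when config.json declares a gptq
    # quantization_config.
    config_path = Path(model_dir) / "config.json"
    if not config_path.exists():
        return False
    config = json.loads(config_path.read_text())
    quant = config.get("quantization_config") or {}
    if quant.get("quant_method") != "gptq":
        return False
    # This implementation only supports a single model.safetensors file.
    return (Path(model_dir) / "model.safetensors").exists()
```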
Merge branch henk717/united into exllama
Hook up use_default_badwordids in exllama
Use the value of the use_default_badwordids setting to configure bad_words_ids. Also add square brackets to bad_words_ids when use_default_badwordids is True. Fix an issue with attempting to use the tokenizer too early, and fix an exception populating Lua bridge data when zero tokens are generated, which can now happen if use_default_badwordids is False and the first token generated is EOS.
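One plausible shape for that wiring, assuming a Hugging Face style tokenizer (helper name, bracket encoding, and return convention are all assumptions for illustration):

```python
def build_bad_words_ids(tokenizer, use_default_badwordids: bool):
    # When the setting is off, generation may begin (or immediately end)
    # with EOS, so downstream code must tolerate zero generated tokens.
    if not use_default_badwordids:
        return []
    bad_words_ids = [[tokenizer.bos_token_id], [tokenizer.eos_token_id]]
    # Also ban square brackets, encoded through the tokenizer.
    for bracket in ("[", "]"):
        bad_words_ids.append(tokenizer.encode(bracket, add_special_tokens=False))
    return bad_words_ids
```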
Add a new inference model backend based on exllama.
Most of the work on this backend was done by Occam. My main contribution was discovering and working around a bug in torch.multinomial, hooking up stoppers, configuring bad_words_ids, and some other minor bug fixes.