Using non LoRA Alpaca model #303

Closed
mjorgers opened this issue Mar 19, 2023 · 11 comments
Labels
model (Model specific), question (Further information is requested)

Comments

@mjorgers

The following repo contains a recreation of the original weights for Alpaca, without using LoRA. How could we use that model with this project? https://github.com/pointnetwork/point-alpaca
Thanks a bunch!

@thomasantony

You should theoretically be able to run the same convert and quantize scripts on that model and use them with llama.cpp.

@gjmulder added the question (Further information is requested), wontfix (This will not be worked on), and model (Model specific) labels on Mar 20, 2023
@clulece

clulece commented Mar 20, 2023

You should theoretically be able to run the same convert and quantize scripts on that model and use them with llama.cpp.

I tried to convert the recreated weights using the convert script but got the following error:

TypeError: Got unsupported ScalarType BFloat16

Be forewarned, I have absolutely no clue what I'm doing. I'm working on changing this, but in the meantime I (and many others, I imagine) would really appreciate guidance from those with the required know-how.
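
For reference, this is the kind of pre-pass I've been experimenting with to get past that error. It just casts any bfloat16 tensors in the shards to float16 before running the convert script; treat it as a rough, untested sketch rather than the proper fix:

# Rough pre-pass sketch: cast bfloat16 tensors to float16 so the convert
# script's numpy conversion no longer hits "unsupported ScalarType BFloat16".
# Adjust the glob to wherever the decrypted shards actually live.
import glob
import torch

for path in sorted(glob.glob("models/7B/pytorch_model-*-of-00003.bin")):
    state_dict = torch.load(path, map_location="cpu")
    for name, tensor in state_dict.items():
        if isinstance(tensor, torch.Tensor) and tensor.dtype == torch.bfloat16:
            state_dict[name] = tensor.to(torch.float16)
    torch.save(state_dict, path)  # overwrites in place; keep a backup first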

FYI, the counterpart of this question over at point-alpaca can be found here: pointnetwork/point-alpaca#3

@adntaha

adntaha commented Mar 20, 2023

Heya, do you mind laying out the steps you've taken to get where you are now? I'm trying to do the same thing but can't get past the initial making-a-params-json-from-the-config-json hurdle.

@FNsi
Contributor

FNsi commented Mar 21, 2023

Sorry, but does anyone know how to merge the LoRA weights into the raw model?
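
Something along these lines is what I've been imagining with the peft library, but I'm not sure the repo names or calls are right, so treat it as a sketch rather than a tested recipe:

# Rough sketch: load the base model, apply the LoRA adapter, fold the deltas
# into the base weights, and save a plain checkpoint. Repo names are guesses.
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b")
model = model.merge_and_unload()  # bakes the LoRA weights into the base model
model.save_pretrained("./llama-7b-alpaca-merged")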

@gjmulder removed the wontfix (This will not be worked on) label on Mar 21, 2023
@clulece

clulece commented Mar 22, 2023

Heya, do you mind laying out the steps you've taken to get where you are now? I'm trying to do the same thing but can't get past the initial making-a-params-json-from-the-config-json hurdle.

I just took the "decrypted" pytorch_model-*-of-00003.bin files, put them in models/7B, renamed them so that their names would align with what the scripts in this repo expect, and then ran the standard scripts unmodified.

Like I said, I'm pretty clueless when it comes to deep learning and the formats/conventions it uses. I'll keep aimlessly banging my head against this until the non-LoRA Alpaca model works with llama.cpp. gjmulder removed the wontfix tag, which I take as an indication that proper support may be implemented.

In the unlikely case that I manage to get it working before official support is added, I promise to post how here.

@adntaha

adntaha commented Mar 23, 2023

Heya, I've figured it out! I took Alpaca-LoRA's export_state_dict_checkpoint.py and adapted it a bit to fit our use case! Here's a link to my tweaked version: https://gist.github.com/botatooo/7ab9aa95eab61d1b64edc0263453230a

Steps:

  • Download tweaked export_state_dict_checkpoint.py and move it into point-alpaca's directory
  • Run it using python export_state_dict_checkpoint.py
  • Once it's done, you'll want to
    • create a new directory; I'll call it palpaca
    • rename ckpt to 7B and move it into the new directory
    • copy tokenizer.model from results into the new directory.

Your directory structure should now look something like this:

chat.py
encrypt.py
[... other point-alpaca files ...]
palpaca/
    7B/
        consolidated.00.pth
        params.json
    tokenizer.model
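
As a sanity check, the params.json that ends up in 7B should look roughly like the dictionary below; the values are assumed from the original LLaMA 7B config, so double-check against what the export script actually wrote instead of taking this verbatim:

# Sanity check: recreate params.json for 7B if it's missing.
# Values assumed from the original LLaMA 7B config.
import json

params = {
    "dim": 4096,
    "multiple_of": 256,
    "n_heads": 32,
    "n_layers": 32,
    "norm_eps": 1e-06,
    "vocab_size": -1,
}
with open("palpaca/7B/params.json", "w") as f:
    json.dump(params, f)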

Note: You'll want to wait until #428 gets merged, or fix the quantize script yourself

Now you can move palpaca into the llama.cpp folder and quantize it as you usually would:

python3 convert-pth-to-ggml.py palpaca/7B/ 1
python3 quantize.py -m palpaca 7B

# start inferencing!
./main -m ./palpaca/7B/ggml-model-q4_0.bin --color -f ./prompts/alpaca.txt -ins

I hope this was clear enough

@srhm-ca

srhm-ca commented Mar 24, 2023

./main -m ./palpaca/7B/ggml-model-q4_0.bin --color -f ./prompts/alpaca.txt -ins

I was with you until this step. I'm receiving the below:

llama_model_load: loading model part 1/1 from './alpaca/7B/ggml-model-q4_0.bin'
llama_model_load: llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
llama_init_from_file: failed to load model
main: error: failed to load model './alpaca/7B/ggml-model-q4_0.bin'

This issue suggests I should recompile, which I've done. Is the issue with the below?

llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1

@adntaha

adntaha commented Mar 25, 2023

Huhhhhhh, I've just tried using LoRA and it worked; however, I got the same error you got when trying with point-alpaca and the tweaked script, so I wonder if it has something to do with the arguments passed to from_pretrained..

@anzz1
Contributor

anzz1 commented Mar 25, 2023

@sr-hm @botatooo

The cause of this is that the point-alpaca model in question has an added "[PAD]" token, so the resulting model contains 32001 tokens, but the vocab size was set to 32000, resulting in a mismatch between the tensor shapes and the number of tokens. You can edit the params and change it to 32001, but then it crashes (tokenizer.model does not have this token, the value isn't supported because it's not divisible by 256, and I don't think you can have uneven numbers because the byte-pair logic needs them to be n / 2?).

i "fixed" it by truncating the shape to 32000 thus yeeting the added token out of existence. it seems to work fine and the token is probably not used anyway, but if it is there is a chance the output could be affected in some way.

Ugly hack for point-alpaca: https://gist.github.com/anzz1/6c0b38a1593879065b364bc02f2d3de4
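
The idea boils down to something like this (tensor names assumed from the standard LLaMA checkpoint layout; back up the file before overwriting it):

# Rough sketch of the truncation: shrink the token-dependent tensors back to
# 32000 rows so they match the vocab size in params.json, dropping "[PAD]".
import torch

ckpt_path = "palpaca/7B/consolidated.00.pth"
checkpoint = torch.load(ckpt_path, map_location="cpu")
for name in ("tok_embeddings.weight", "output.weight"):
    if name in checkpoint and checkpoint[name].shape[0] > 32000:
        checkpoint[name] = checkpoint[name][:32000, :].clone()
torch.save(checkpoint, ckpt_path)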

@xor2003

xor2003 commented Apr 2, 2023

I have developed the following prompt and script to use this model:

 Text transcript of a never ending dialog, where ${USER_NAME} interacts with an AI assistant named ${AI_NAME}.
${AI_NAME} is helpful, kind, honest, friendly, good at writing and never fails to answer ${USER_NAME}’s requests immediately and with details and precision.
There are no annotations like (30 seconds passed...) or (to himself), just what ${USER_NAME} and ${AI_NAME} say aloud to each other.
The dialog lasts for years, the entirety of it is shared below. It's 10000 pages long.
If you are a doctor, please answer the medical questions based on the patient's description.

Doctor: I am Doctor, what medical questions do you have?

chat-doctor.tar.gz

@larawehbe

Heya, I've figured it out! I took Alpaca-LoRA's export_state_dict_checkpoint.py and adapted it a bit to fit our use case! Here's a link to my tweaked version: https://gist.github.com/botatooo/7ab9aa95eab61d1b64edc0263453230a [...]

Amazing!
Where did you get tokenizer.model from?
