
13B model issue: tensor 'tok_embeddings.weight' has wrong size in model file #24

Closed
Tarang opened this issue Mar 11, 2023 · 4 comments
Labels
build Compilation issues

Comments


Tarang commented Mar 11, 2023

I tried the following with the latest master (6b2cb63):

python convert-pth-to-ggml.py models/13B/ 1
./quantize ./models/13B/ggml-model-f16.bin   ./models/13B/ggml-model-q4_0.bin 2
./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2
ls models/13B/
checklist.chk         consolidated.00.pth   consolidated.01.pth   ggml-model-f16.bin    ggml-model-f16.bin.1  ggml-model-q4_0.bin   ggml-model-q4_0.bin.1 params.json
./main -m ./models/13B/ggml-model-q4_0.bin -t 8 -n 128
main: seed = 1678568386
llama_model_load: loading model from './models/13B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 5120
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 13824
llama_model_load: ggml ctx size = 8559.49 MB
llama_model_load: memory_size =   800.00 MB, n_mem = 20480
llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
main: failed to load model from './models/13B/ggml-model-q4_0.bin'

What does the error "tensor 'tok_embeddings.weight' has wrong size in model file" mean?


djkz commented Mar 11, 2023

It means you are running an old build. Recompile your main and quantize binaries, then re-quantize the weights; it should work after that.
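For anyone hitting the same thing, a minimal sketch of those steps, using the 13B paths from the report above. The git pull / make steps are the standard llama.cpp workflow at the time, not something stated in this thread, so adjust them to however you built originally:

# Pull the latest code and rebuild the binaries:
git pull
make clean && make

# Re-convert and re-quantize the weights with the new binaries:
python convert-pth-to-ggml.py models/13B/ 1
./quantize ./models/13B/ggml-model-f16.bin   ./models/13B/ggml-model-q4_0.bin 2
./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2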


Tarang commented Mar 11, 2023

That was it!


Ionaut commented Mar 21, 2023

How is this done exactly?

Komal-99 commented:

Hi,
I am not able to quantize my model. After running convert.py from llama.cpp, the model was converted to GGUF, but running

./quantize C:\PrivateGPT\privategpt\privateGPT-main\llama.cpp-master\models\ggml-model-f16.gguf C:\PrivateGPT\privategpt\privateGPT-main\llama.cpp-master\models\ggml-model-q4_0.gguf q4_0

gives this error: ./quantize is not a cmdlet or script function.
Also, @Tarang, can you please tell me how you were able to create a .bin file? convert.py is creating a .gguf file by default.
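That PowerShell error usually just means there is no quantize executable at that path: the tools have to be built first, and on Windows the binary is quantize.exe. A minimal sketch, assuming a CMake build with the default Visual Studio generator (the build\bin\Release\ output path is that generator's convention; adjust it if your build puts binaries elsewhere):

# Build the tools first (requires CMake and a C++ toolchain):
cmake -B build
cmake --build build --config Release

# Then invoke the built binary with an explicit path:
.\build\bin\Release\quantize.exe `
    C:\PrivateGPT\privategpt\privateGPT-main\llama.cpp-master\models\ggml-model-f16.gguf `
    C:\PrivateGPT\privategpt\privateGPT-main\llama.cpp-master\models\ggml-model-q4_0.gguf `
    q4_0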
