
error loading model: missing tok_embeddings.weight #1381

Closed
realcarlos opened this issue May 9, 2023 · 8 comments

realcarlos commented May 9, 2023

$ ./examples/chat-gpt2.sh
main: build = 480 (f4cef87)
main: seed = 1683650863
llama.cpp: loading model from ./models/ggml-model-gpt2-q4_0.bin
error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model
main: error: failed to load model './models/ggml-model-gpt2-q4_0.bin'


mirek190 commented May 10, 2023

Not a compatible ggml model.

realcarlos (Author) commented:

> Not a compatible ggml model.

I followed these steps:

quantize the model to 4 bits (using the q4_0 method):

./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0

run the inference:

./main -m ./models/7B/ggml-model-q4_0.bin -n 128

but I cannot run it successfully.

mirek190 commented:

I get this error when I try to load a model in an incompatible format, e.g. neox instead of ggjt.
Try koboldcpp; it handles more model formats and still ships as a single executable.
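The loader rejects files whose container format it does not understand, and the first four bytes of the file identify that container. A minimal sketch of checking it, assuming the magic constants used by llama.cpp around mid-2023 (verify them against your build's sources):

```python
import struct

# ggml-family magic numbers (read as little-endian uint32); these values
# are an assumption based on the llama.cpp loader of mid-2023.
MAGICS = {
    0x67676D6C: "ggml (unversioned, no longer supported)",
    0x67676D66: "ggmf",
    0x67676A74: "ggjt",
    0x46554747: "gguf",  # the successor format, "GGUF" in ASCII
}


def identify(path: str) -> str:
    """Return a human-readable guess at the model container format."""
    with open(path, "rb") as f:
        magic, = struct.unpack("<I", f.read(4))
        name = MAGICS.get(magic)
        if name is None:
            return f"unknown magic 0x{magic:08x}"
        if name in ("ggmf", "ggjt", "gguf"):
            # Versioned formats store a uint32 version right after the magic.
            version, = struct.unpack("<I", f.read(4))
            return f"{name} v{version}"
        return name
```

If this reports anything other than the format your llama.cpp build expects (ggjt v3 for builds of that period), the model needs to be re-converted rather than debugged further.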


luoweb commented Jun 27, 2023

Same problem here; how can I fix it?

12:05AM DBG Loading model 'starchat-beta.ggmlv3.q4_0.bin' greedly
12:05AM DBG [llama] Attempting to load
12:05AM DBG Loading model llama from starchat-beta.ggmlv3.q4_0.bin
12:05AM DBG Loading model in memory from file: /Users/block/code/data/models/starchat-beta.ggmlv3.q4_0.bin
llama.cpp: loading model from /Users/block/code/data/models/starchat-beta.ggmlv3.q4_0.bin
error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model
12:05AM DBG [llama] Fails: failed loading model
12:05AM DBG [gpt4all] Attempting to load
12:05AM DBG Loading model gpt4all from starchat-beta.ggmlv3.q4_0.bin
12:05AM DBG Loading model in memory from file: /Users/block/code/data/models/starchat-beta.ggmlv3.q4_0.bin

skbylife commented:

Same problem; can anyone help?

./main -m ./models/7B/ggml-model-q4_0.bin -n 128
main: build = 899 (41c6741)
main: seed = 1690255993
llama.cpp: loading model from ./models/7B/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 1.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.00 MB
error loading model: llama.cpp: tensor 'tok_embeddings.weight' is missing from model
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/7B/ggml-model-q4_0.bin'
main: error: unable to load model
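Note that in this log the container itself parses fine (format = ggjt v3, n_vocab = 32000), yet the tok_embeddings.weight tensor is still missing, which suggests the file's contents were not converted from a LLaMA-architecture model. A sketch of reading just the fixed hyperparameter block to sanity-check a file, assuming the ggjt field order (n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype) used by the llama.cpp loader of that era:

```python
import struct

GGJT_MAGIC = 0x67676A74  # assumption: ggjt magic as in llama.cpp mid-2023


def read_ggjt_hparams(path: str) -> dict:
    """Read the fixed hyperparameter block of a ggjt LLaMA file.

    The field order below is an assumption based on the llama.cpp
    loader of that period; check it against your build's sources.
    """
    with open(path, "rb") as f:
        magic, version = struct.unpack("<II", f.read(8))
        if magic != GGJT_MAGIC:
            raise ValueError(f"not a ggjt file (magic 0x{magic:08x})")
        names = ("n_vocab", "n_embd", "n_mult", "n_head",
                 "n_layer", "n_rot", "ftype")
        values = struct.unpack("<7I", f.read(28))
        return {"version": version, **dict(zip(names, values))}
```

A LLaMA-7B conversion should report n_vocab = 32000 and n_embd = 4096, matching the log above; a different vocabulary size points at a non-LLaMA source model, which would explain the missing LLaMA tensor names.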

goerch (Collaborator) commented Jul 25, 2023

I see at least two different models here, probably corresponding to different branches in examples. Do we have any regression testing in place for these?

@realcarlos: main: build = 480 seems pretty old. I'd assume your problem is solved?
@luoweb: llama_model_load_internal: format is missing from your report. I'd assume you are using an old format?
@skbylife: could you please tell us about your base model?

lishaung99 commented:

I have the same problem; can anyone help? What is the solution?

main: build = 992 (0919a0f)
main: seed = 1692321231
llama.cpp: loading model from D:\Work\llama2\llama.cpp\org-models\7B\ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 5.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.00 MB
error loading model: llama.cpp: tensor 'tok_embeddings.weight' is missing from model
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'D:\Work\llama2\llama.cpp\org-models\7B\ggml-model-q4_0.bin'
main: error: unable to load model

github-actions bot added the stale label on Mar 25, 2024

github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
