
error loading model: missing tok_embeddings.weight #1381

Closed
realcarlos opened this issue May 9, 2023 · 8 comments

realcarlos commented May 9, 2023

$ ./examples/chat-gpt2.sh
main: build = 480 (f4cef87)
main: seed = 1683650863
llama.cpp: loading model from ./models/ggml-model-gpt2-q4_0.bin
error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model
main: error: failed to load model './models/ggml-model-gpt2-q4_0.bin'


mirek190 commented May 10, 2023

Not a compatible ggml model.

realcarlos (Author) commented:

> Not a compatible ggml model.

I followed these steps:

quantize the model to 4 bits (using the q4_0 method):

./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0

run the inference:

./main -m ./models/7B/ggml-model-q4_0.bin -n 128

but I cannot run it successfully.

mirek190 commented:

I get this error when I try to load a model in an incompatible format, e.g. neox instead of ggjt.
Try koboldcpp; it handles more model formats and still ships as a single executable.
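The loader rejects files whose container format it does not understand, and the first four bytes of the file identify that container. A minimal sketch of checking it, assuming the magic constants used by llama.cpp around mid-2023 (verify them against your build's sources):

```python
import struct

# ggml-family magic numbers (read as little-endian uint32); these values
# are an assumption based on the llama.cpp loader of mid-2023.
MAGICS = {
    0x67676D6C: "ggml (unversioned, no longer supported)",
    0x67676D66: "ggmf",
    0x67676A74: "ggjt",
    0x46554747: "gguf",  # the successor format, "GGUF" in ASCII
}


def identify(path: str) -> str:
    """Return a human-readable guess at the model container format."""
    with open(path, "rb") as f:
        magic, = struct.unpack("<I", f.read(4))
        name = MAGICS.get(magic)
        if name is None:
            return f"unknown magic 0x{magic:08x}"
        if name in ("ggmf", "ggjt", "gguf"):
            # Versioned formats store a uint32 version right after the magic.
            version, = struct.unpack("<I", f.read(4))
            return f"{name} v{version}"
        return name
```

If this reports anything other than the format your llama.cpp build expects (ggjt v3 for builds of that period), the model needs to be re-converted rather than debugged further.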


luoweb commented Jun 27, 2023

Same problem here; how can I fix it?

12:05AM DBG Loading model 'starchat-beta.ggmlv3.q4_0.bin' greedly
12:05AM DBG [llama] Attempting to load
12:05AM DBG Loading model llama from starchat-beta.ggmlv3.q4_0.bin
12:05AM DBG Loading model in memory from file: /Users/block/code/data/models/starchat-beta.ggmlv3.q4_0.bin
llama.cpp: loading model from /Users/block/code/data/models/starchat-beta.ggmlv3.q4_0.bin
error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model
12:05AM DBG [llama] Fails: failed loading model
12:05AM DBG [gpt4all] Attempting to load
12:05AM DBG Loading model gpt4all from starchat-beta.ggmlv3.q4_0.bin
12:05AM DBG Loading model in memory from file: /Users/block/code/data/models/starchat-beta.ggmlv3.q4_0.bin

skbylife commented:

Same problem; can anyone help?

./main -m ./models/7B/ggml-model-q4_0.bin -n 128
main: build = 899 (41c6741)
main: seed = 1690255993
llama.cpp: loading model from ./models/7B/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 1.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.00 MB
error loading model: llama.cpp: tensor 'tok_embeddings.weight' is missing from model
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/7B/ggml-model-q4_0.bin'
main: error: unable to load model
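Note that in this log the container itself parses fine (format = ggjt v3, n_vocab = 32000), yet the tok_embeddings.weight tensor is still missing, which suggests the file's contents were not converted from a LLaMA-architecture model. A sketch of reading just the fixed hyperparameter block to sanity-check a file, assuming the ggjt field order (n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype) used by the llama.cpp loader of that era:

```python
import struct

GGJT_MAGIC = 0x67676A74  # assumption: ggjt magic as in llama.cpp mid-2023


def read_ggjt_hparams(path: str) -> dict:
    """Read the fixed hyperparameter block of a ggjt LLaMA file.

    The field order below is an assumption based on the llama.cpp
    loader of that period; check it against your build's sources.
    """
    with open(path, "rb") as f:
        magic, version = struct.unpack("<II", f.read(8))
        if magic != GGJT_MAGIC:
            raise ValueError(f"not a ggjt file (magic 0x{magic:08x})")
        names = ("n_vocab", "n_embd", "n_mult", "n_head",
                 "n_layer", "n_rot", "ftype")
        values = struct.unpack("<7I", f.read(28))
        return {"version": version, **dict(zip(names, values))}
```

A LLaMA-7B conversion should report n_vocab = 32000 and n_embd = 4096, matching the log above; a different vocabulary size points at a non-LLaMA source model, which would explain the missing LLaMA tensor names.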

goerch (Collaborator) commented Jul 25, 2023

I see at least two different models here, probably corresponding to different branches in examples. Do we have any regression testing in place for these?

@realcarlos: main: build = 480 seems pretty old. I'd assume your problem is solved?
@luoweb: llama_model_load_internal: format is missing from your report. I'd assume you are using an old format?
@skbylife: could you please tell us about your base model?

lishaung99 commented:

I have the same problem; can anyone help? What is the solution?

main: build = 992 (0919a0f)
main: seed = 1692321231
llama.cpp: loading model from D:\Work\llama2\llama.cpp\org-models\7B\ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 5.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.00 MB
error loading model: llama.cpp: tensor 'tok_embeddings.weight' is missing from model
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'D:\Work\llama2\llama.cpp\org-models\7B\ggml-model-q4_0.bin'
main: error: unable to load model

github-actions bot added the stale label on Mar 25, 2024

github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
