
GPT4All: invalid model file (bad magic) #662
Closed · doomguy opened this issue Mar 31, 2023 · 10 comments


doomguy commented Mar 31, 2023

Hi there, I followed the instructions to get gpt4all running with llama.cpp, but was somehow unable to produce a valid model using the provided Python conversion scripts:

% python3 convert-gpt4all-to-ggml.py models/gpt4all-7B/gpt4all-lora-quantized.bin ./models/tokenizer.model
converting models/gpt4all-7B/gpt4all-lora-quantized.bin
% ./main -m ./models/gpt4all-7B/gpt4all-lora-quantized.bin -n 128
main: seed = 1680294943
llama_model_load: loading model from './models/gpt4all-7B/gpt4all-lora-quantized.bin' - please wait ...
./models/gpt4all-7B/gpt4all-lora-quantized.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])
	you most likely need to regenerate your ggml files
	the benefit is you'll get 10-100x faster load times
	see https://github.com/ggerganov/llama.cpp/issues/91
	use convert-pth-to-ggml.py to regenerate from original pth
	use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
main: error: failed to load model './models/gpt4all-7B/gpt4all-lora-quantized.bin'

Are just the magic bytes in the Python script wrong, or is it a completely different format?
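
For reference, a quick way to peek at the magic yourself (just a one-off sanity check, not one of the repo's scripts; it assumes the file was written on a little-endian machine, which is how the conversion scripts write it). 0x67676d66 is ASCII "ggmf" (the older format), while current main wants 0x67676a74, i.e. "ggjt":

% python3 -c "import struct,sys; print(hex(struct.unpack('<I', open(sys.argv[1],'rb').read(4))[0]))" models/gpt4all-7B/gpt4all-lora-quantized.bin
0x67676d66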

Related issues: #647


CoderRC commented Mar 31, 2023

The output below shows that your file has the wrong format. My file works with this repo, so if the file is proper it will work (see #103 (comment)).
llama_model_load: loading model from './models/gpt4all-7B/gpt4all-lora-quantized.bin' - please wait ...
./models/gpt4all-7B/gpt4all-lora-quantized.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])


doomguy commented Mar 31, 2023

Hi @CoderRC,

thanks for your reply.

Can you please share your gpt4all SHA-256 sums?

% shasum -a 256 gpt4all-lora-quantized.bin *
d9af98b0350fc8af7211097e816ffbb8bae9a18f8aea8c50ff94a99bd6cb2c7c  gpt4all-lora-quantized.bin
05c9dc0a4904f3b232cffe717091b0b0a8246f49c3f253208fbf342ed79a6122  gpt4all-lora-quantized.bin.orig

@leonardohn (Contributor)

You need to use convert-gpt4all-to-ggml.py first and then migrate-ggml-2023-03-30-pr613.py.
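
For example, reusing the paths from the report above (the first script seems to convert the .bin in place, keeping a .orig copy, and the migrate script writes a new ggjt-format file; adjust paths to your setup):

% python3 convert-gpt4all-to-ggml.py models/gpt4all-7B/gpt4all-lora-quantized.bin ./models/tokenizer.model
% python3 migrate-ggml-2023-03-30-pr613.py models/gpt4all-7B/gpt4all-lora-quantized.bin models/gpt4all-7B/gpt4all-lora-quantized_ggjt.bin
% ./main -m models/gpt4all-7B/gpt4all-lora-quantized_ggjt.bin -n 128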

@nlpander

Yes, I'm getting the same issue as @doomguy. I built llama using CMake with no POSIX additions; could this be the source of the error? I also get the same error (bad file magic) when I attempt to quantize the 30B model. I've tried a few different copies and re-downloading, but no luck.

@nlpander

> You need to use convert-gpt4all-to-ggml.py first and then migrate-ggml-2023-03-30-pr613.py.

Hmm, still having issues - "Failed loading model".


doomguy commented Mar 31, 2023

Hi @leonardohn,

Thanks - that did the trick.

That Python script just slipped under my radar.

% python3 migrate-ggml-2023-03-30-pr613.py models/gpt4all-7B/gpt4all-lora-quantized.bin  models/gpt4all-7B/gpt4all-lora-quantized_ggjt.bin
Processing part 1 of 1

Processing tensor b'tok_embeddings.weight' with shape: [32001, 4096] and type: Q4_0
Processing tensor b'layers.0.attention.wq.weight' with shape: [4096, 4096] and type: Q4_0
Processing tensor b'layers.0.attention.wk.weight' with shape: [4096, 4096] and type: Q4_0
Processing tensor b'layers.0.attention.wv.weight' with shape: [4096, 4096] and type: Q4_0
...
Processing tensor b'output.weight' with shape: [32001, 4096] and type: Q4_0
Done. Output file: models/gpt4all-7B/gpt4all-lora-quantized_ggjt.bin
% ./main -m models/gpt4all-7B/gpt4all-lora-quantized_ggjt.bin -p "it is a good idea to change file formats often because"

doomguy closed this as completed Mar 31, 2023
@leonardohn (Contributor)

@doomguy I had the same issue yesterday, after they introduced the breaking change. It is still not in the README, but the changes seem to be for a good reason.

@nlpander That instruction is for the GPT4All-7B model. I guess the 30B model is on a different version of ggml, so you could try using the other conversion scripts.
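
If you still have the original .pth weights, regenerating with the current convert-pth-to-ggml.py should produce the new format directly; otherwise the migrate script may be enough. Roughly (the paths are just examples - I don't have a 30B setup here to verify):

% python3 convert-pth-to-ggml.py models/30B/ 1

or, if the originals are gone:

% python3 migrate-ggml-2023-03-30-pr613.py models/30B/ggml-model-f16.bin models/30B/ggml-model-f16-ggjt.bin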

@Freshbytes

> You need to use convert-gpt4all-to-ggml.py first and then migrate-ggml-2023-03-30-pr613.py.

The files convert-gpt4all-to-ggml.py and migrate-ggml-2023-03-30-pr613.py don't exist anymore.
Do you know what can be done? Some of us still get the error:

llama_model_load: loading model from './gpt4all-lora-quantized-ggml.bin' - please wait ...
./gpt4all-lora-quantized-ggml.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])
        you most likely need to regenerate your ggml files
        the benefit is you'll get 10-100x faster load times
        see https://github.com/ggerganov/llama.cpp/issues/91
        use convert-pth-to-ggml.py to regenerate from original pth
        use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model

@leonardohn (Contributor)

@Freshbytes you can fetch them from a previous commit. I think the scripts are both self-contained, so the changes on the main project shouldn't affect them.
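
Something like this should do it (the placeholder is whatever commit the first command reports as having deleted the files):

% git log --oneline --diff-filter=D -- convert-gpt4all-to-ggml.py migrate-ggml-2023-03-30-pr613.py
% git checkout <deleting-commit>~1 -- convert-gpt4all-to-ggml.py migrate-ggml-2023-03-30-pr613.py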

@noamsiegel

> breaking change

Did you find a fix for this? I am trying to use gpt4all models (snoozy)
