Hi all,

I am very new to using AI models. I have tried a few models for translation, like Alpaca.cpp and GPT4All, but those are based on 7B. I want to run a 30B/65B model on my server. I followed the installation guide, starting with 7B as a first trial, and everything worked until converting the .pth file to .ggml.

The error message is shown below:
llama@llama:~/llama.cpp$ python3 convert.py models/7B/
Loading model file models/7B/consolidated.00.pth
Loading vocab file models/tokenizer.model
Writing vocab...
[ 1/291] Writing tensor tok_embeddings.weight | size 32000 x 4096 | type UnquantizedDataType(name='F16')
[ 2/291] Writing tensor norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[ 3/291] Writing tensor output.weight | size 32000 x 4096 | type UnquantizedDataType(name='F16')
Traceback (most recent call last):
  File "/home/llama/llama.cpp/convert.py", line 1149, in <module>
    main()
  File "/home/llama/llama.cpp/convert.py", line 1144, in main
    OutputFile.write_all(outfile, params, model, vocab)
  File "/home/llama/llama.cpp/convert.py", line 953, in write_all
    for i, ((name, lazy_tensor), ndarray) in enumerate(zip(model.items(), ndarrays)):
  File "/home/llama/llama.cpp/convert.py", line 875, in bounded_parallel_map
    result = futures.pop(0).result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/llama/llama.cpp/convert.py", line 950, in do_item
    return lazy_tensor.load().to_ggml().ndarray
  File "/home/llama/llama.cpp/convert.py", line 489, in load
    ret = self._load()
  File "/home/llama/llama.cpp/convert.py", line 497, in load
    return self.load().astype(data_type)
  File "/home/llama/llama.cpp/convert.py", line 489, in load
    ret = self._load()
  File "/home/llama/llama.cpp/convert.py", line 695, in load
    return UnquantizedTensor(storage.load(storage_offset, elm_count).reshape(size))
  File "/home/llama/llama.cpp/convert.py", line 680, in load
    fp = self.zip_file.open(info)
  File "/usr/lib/python3.10/zipfile.py", line 1535, in open
    raise BadZipFile("Bad magic number for file header")
zipfile.BadZipFile: Bad magic number for file header
Machine spec:
CPU: Ryzen 5700G
GPU: RTX 2060 12GB
RAM: 32GB
Thank you so much
Are you certain the model you downloaded is not corrupt? The error you're getting means one of the zip files is missing a header that every zip file should have. Is your download of the 7B model perhaps unfinished?
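One way to confirm that before re-downloading: a .pth checkpoint written by torch.save() in its default format is a zip archive, so Python's standard zipfile module can verify it without loading any tensors. Here is a minimal sketch — the path is an assumption taken from the log above, so adjust it to your layout.

```python
# Minimal integrity check for a zip-based PyTorch .pth checkpoint.
import zipfile

path = "models/7B/consolidated.00.pth"  # assumed path from the log above

try:
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and verifies headers and CRCs;
        # it returns the name of the first bad member, or None if all pass.
        bad = zf.testzip()
        print("archive looks intact" if bad is None else f"corrupt member: {bad}")
except zipfile.BadZipFile as exc:
    print(f"not a readable zip archive: {exc}")
```

If the check fails, re-download the weights. The official LLaMA distribution also ships a checklist.chk file in each model directory, so running md5sum -c checklist.chk inside models/7B should likewise confirm whether every file arrived intact.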