
py : add Gemma conversion from HF models #5647

Merged: 4 commits merged into master on Feb 22, 2024
Conversation

ggerganov (Owner)

# gemma-2b
python3 convert-hf-to-gguf.py ~/Data/huggingface/gemma-2b/ --outfile models/gemma-2b/ggml-model-f16.gguf --outtype f16

# gemma-7b
python3 convert-hf-to-gguf.py ~/Data/huggingface/gemma-7b/ --outfile models/gemma-7b/ggml-model-f16.gguf --outtype f16

ggerganov requested a review from cebtenzzre on February 21, 2024 at 20:52
ggerganov added the "need feedback" label (testing and feedback with results are needed) on Feb 21, 2024
ggerganov mentioned this pull request on Feb 21, 2024
twoxfh commented Feb 21, 2024

I successfully created a 2B GGUF and loaded the model with the server on master. Thanks!

ggerganov and others added 2 commits February 22, 2024 11:27
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Yefori-Go left a comment


Works well for me

postmasters (Contributor)

I notice that the HF config.json says there are 256000 tokens. But the embedding layer is 256128 x d_model. Not sure if there would be latent issues later.

ggerganov (Owner, Author)

I noticed that as well, but I think the actual tensor shape in the safetensors files is [2048, 256000] instead of [2048, 256128] (Gemma-2B), leading to a discrepancy with the published FP32 GGUF files. So we probably have to pad with 0s? But what would be the point of that? Not sure - it would be helpful to get some more eyes on this.
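For reference, padding would just mean appending zero rows along the token dimension; a minimal numpy sketch (hypothetical helper, not code from this PR, assuming the token dimension is the first axis):

```python
import numpy as np

def pad_token_dim(embed: np.ndarray, padded_vocab: int) -> np.ndarray:
    """Zero-pad the token dimension of an embedding matrix.

    Hypothetical helper, not part of convert-hf-to-gguf.py; assumes the
    token dimension is the first axis, e.g. [256000, 2048] -> [256128, 2048].
    """
    n_vocab, n_embd = embed.shape
    if n_vocab >= padded_vocab:
        return embed
    pad = np.zeros((padded_vocab - n_vocab, n_embd), dtype=embed.dtype)
    return np.concatenate([embed, pad], axis=0)
```

The padded rows would never be selected by a tokenizer limited to 256000 ids, which is why it is unclear what padding would actually buy us here.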

postmasters (Contributor) commented Feb 22, 2024

I just checked. The original internal checkpoint uses [256128, 3072] for 7B. Perhaps the conversion from that checkpoint to SafeTensors has dropped the last hundred tokens.
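One way to double-check what the SafeTensors shards actually store is to read the shapes straight out of the file header; a small sketch using only the Python standard library (the file and tensor names below are illustrative):

```python
import json
import struct

def safetensors_shapes(path: str) -> dict:
    """Read tensor shapes directly from a .safetensors header.

    A .safetensors file begins with an 8-byte little-endian header length,
    followed by a JSON header mapping tensor names to dtype/shape/offsets.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {name: meta["shape"] for name, meta in header.items() if name != "__metadata__"}

# Illustrative usage:
# safetensors_shapes("model-00001-of-00002.safetensors")["model.embed_tokens.weight"]
```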

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
ggerganov merged commit 847eedb into master on Feb 22, 2024
42 of 49 checks passed
ggerganov deleted the gg/add-gemma-conversion branch on February 22, 2024 at 21:22
Ronnie-Leon76
[screenshot of the error]
I have tried quantizing a fine-tuned gemma-7b model that was loaded in 4-bit, but I get the error shown above: Can not map tensor 'model.layers.0.mlp.down_proj.weight.absmax'. @ggerganov I'd appreciate it if you could help me resolve this issue.

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* py : add gemma conversion from HF models

* Update convert-hf-to-gguf.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update convert-hf-to-gguf.py

Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update convert-hf-to-gguf.py

Co-authored-by: Jared Van Bortel <jared@nomic.ai>

---------

Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
(same squashed commit message as above)
diogo-garcia
> I have tried quantizing a fine-tuned gemma-7b model that was loaded in 4-bit, but I get the error: Can not map tensor 'model.layers.0.mlp.down_proj.weight.absmax'. @ggerganov I'd appreciate it if you could help me resolve this issue.

I recently downloaded the Meta-Llama-3-8b model from Hugging Face and attempted to convert it to GGUF format using the following command:

python3 /root/llama.cpp/convert-hf-to-gguf.py /root/models/meta-llama-3-8b --outfile /root/models/meta-llama-3-8b.gguf --outtype f32

However, I encountered the same error. The specific error message I received was:

File "/root/llama.cpp/convert-hf-to-gguf.py", line 182, in map_tensor_name
raise ValueError(f"Can not map tensor {name!r} {try_suffixes} {self.tensor_map}")
ValueError: Can not map tensor 'model.layers.0.mlp.down_proj.weight.absmax'

I also tried modifying the function map_tensor_name in the script to change the suffixes from .weight to .weight_map, which resolved the previous error, but now I am getting a new error:

File "/root/llama.cpp/convert-hf-to-gguf.py", line 182, in map_tensor_name
raise ValueError(f"Can not map tensor {name!r} {try_suffixes} {self.tensor_map}")
ValueError: Can not map tensor 'model.embed_tokens.weight'

The reason I made this change is that the structure of model.safetensors.index.json indicates that the weights are mapped with suffixes like .weight_map. Here is an excerpt from the JSON file for reference:

{
  "metadata": {
    "total_size": 6027779904
  },
  "weight_map": {
    "lm_head.weight": "model-00002-of-00002.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.mlp.down_proj.weight.absmax": "model-00001-of-00002.safetensors",
    "model.layers.0.mlp.down_proj.weight.quant_map": "model-00001-of-00002.safetensors",
    ...

Could anyone please provide guidance on how to properly map these tensors, or if there's a different approach needed for this conversion? Thank you!
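For context, the entries the converter fails on can be listed straight from that index file; a minimal sketch (the path is illustrative, and the comment about where the extra tensors come from is an assumption based on the 4-bit export described above):

```python
import json

# Illustrative path; point this at the checkpoint's index file.
with open("model.safetensors.index.json") as f:
    index = json.load(f)

# convert-hf-to-gguf.py maps the plain HF tensor names (".weight" / ".bias");
# entries with extra suffixes such as ".absmax" or ".quant_map" likely come
# from the 4-bit quantized export and have no counterpart in the GGUF tensor map.
extra = sorted(
    name for name in index["weight_map"]
    if not name.endswith((".weight", ".bias"))
)
print("\n".join(extra))
```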
