Gemma2 GGUF: modeling_gguf_pytorch_utils.py: ValueError: Architecture gemma2 not supported
#32577
Comments
I think you need to open a PR to add a gemma2<>gguf tensor name mapping, in this code: src/transformers/integrations/ggml.py, lines 74 to 120 (at commit 48101cf).
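For anyone unfamiliar with how these mappings are consumed, here is a minimal illustrative sketch of the idea, not the actual transformers implementation (the real logic lives in modeling_gguf_pytorch_utils.py): GGUF tensor names such as blk.0.attn_q.weight are rewritten fragment by fragment into transformers state-dict keys.

```python
# Illustrative sketch only: shows how a per-architecture name mapping can
# turn GGUF tensor names into transformers state-dict keys. The actual
# implementation is in transformers' modeling_gguf_pytorch_utils.py.
GGUF_TENSOR_MAPPING = {
    "gemma2": {
        "token_embd": "model.embed_tokens",
        "blk": "model.layers",
        "attn_q": "self_attn.q_proj",
        "output_norm": "model.norm",
    }
}

def rename_tensor(name: str, arch: str = "gemma2") -> str:
    # Replace each GGUF name fragment with its transformers counterpart.
    for gguf_part, hf_part in GGUF_TENSOR_MAPPING[arch].items():
        if gguf_part in name:
            name = name.replace(gguf_part, hf_part)
    return name

print(rename_tensor("blk.0.attn_q.weight"))
# -> model.layers.0.self_attn.q_proj.weight
```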
@julien-c thanks, I'm able to load the model with this patch:

```diff
--- a/src/transformers/integrations/ggml.py
+++ b/src/transformers/integrations/ggml.py
@@ -117,6 +117,23 @@ GGUF_TENSOR_MAPPING = {
         "output.weight": "lm_head.weight",
         "output_norm": "model.norm",
     },
+    "gemma2": {
+        "token_embd": "model.embed_tokens",
+        "blk": "model.layers",
+        "ffn_up": "mlp.up_proj",
+        "ffn_down": "mlp.down_proj",
+        "ffn_gate": "mlp.gate_proj",
+        "ffn_norm": "post_attention_layernorm",
+        "post_ffw_norm": "post_feedforward_layernorm",
+        "post_attention_norm": "pre_feedforward_layernorm",
+        "attn_norm": "input_layernorm",
+        "attn_q": "self_attn.q_proj",
+        "attn_v": "self_attn.v_proj",
+        "attn_k": "self_attn.k_proj",
+        "attn_output": "self_attn.o_proj",
+        "output.weight": "lm_head.weight",
+        "output_norm": "model.norm",
+    },
 }
@@ -161,6 +178,18 @@ GGUF_CONFIG_MAPPING = {
         "attention.layer_norm_rms_epsilon": "rms_norm_eps",
         "vocab_size": "vocab_size",
     },
+    "gemma2": {
+        "context_length": "max_position_embeddings",
+        "block_count": "num_hidden_layers",
+        "feed_forward_length": "intermediate_size",
+        "embedding_length": "hidden_size",
+        "rope.dimension_count": None,
+        "rope.freq_base": "rope_theta",
+        "attention.head_count": "num_attention_heads",
+        "attention.head_count_kv": "num_key_value_heads",
+        "attention.layer_norm_rms_epsilon": "rms_norm_eps",
+        "vocab_size": "vocab_size",
+    },
     "tokenizer": {
         "ggml.bos_token_id": "bos_token_id",
         "ggml.eos_token_id": "eos_token_id",
```

However, I can't get the model and the tokenizer together to produce meaningful output. If I load the tokenizer as in the example code at https://huggingface.co/docs/transformers/main/gguf :

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "bartowski/gemma-2-2b-it-GGUF"
filename = "gemma-2-2b-it-Q6_K.gguf"
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
input_text = "What is your name?"
input_ids = tokenizer.encode(input_text, return_tensors='pt')
generated_ids = model.generate(input_ids, max_length=30)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)
```

I get this error:
If I use the tokenizer directly from google/gemma-2-2b-it instead:

```python
from os import environ
environ['HF_TOKEN'] = '<my token>'
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "bartowski/gemma-2-2b-it-GGUF"
filename = "gemma-2-2b-it-Q6_K.gguf"
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
input_text = "What is your name?"
input_ids = tokenizer.encode(input_text, return_tensors='pt')
generated_ids = model.generate(input_ids, max_length=30)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)
```

The output is:
Hey @alllexx88, you also need to define the tokenizer for gemma 2. Have a look at how qwen2 gguf was added: https://github.com/huggingface/transformers/pull/31175/files
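To make that pointer concrete, below is a rough sketch of the kind of addition the qwen2 PR made, transposed to gemma2. GGUFLlamaConverter and GGUF_TO_FAST_CONVERTERS are the llama-side names in src/transformers/integrations/ggml.py around this time, but the GGUFGemmaConverter class and its registration are assumptions for illustration, not the merged implementation.

```python
# Hypothetical sketch, not merged code: following the pattern of the qwen2
# GGUF PR (#31175), gemma2 would need a tokenizer converter registered for
# the "gemma2" architecture. The import names reflect
# src/transformers/integrations/ggml.py circa v4.44; verify before relying
# on them. The subclass and registration below are assumptions.
from transformers.integrations.ggml import (
    GGUF_TO_FAST_CONVERTERS,
    GGUFLlamaConverter,
)

class GGUFGemmaConverter(GGUFLlamaConverter):
    # Gemma's GGUF tokenizer is SentencePiece-based like llama's, so reusing
    # the llama converter is a plausible starting point; Gemma-specific
    # details (e.g. special tokens) would have to be verified in a real PR.
    pass

GGUF_TO_FAST_CONVERTERS["gemma2"] = GGUFGemmaConverter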
May I submit a PR for this issue? @alllexx88 @SunMarc
If you have a working solution, @PolRF, feel free to submit a PR!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I'm leaving this issue closed, as we centralized GGUF model addition requests in this issue: #33260
@SunMarc I got a similar error while using vllm to deploy chatglm4-gguf: |
System Info

transformers version: 4.44.0

Who can help?

@SunMarc
Reproduction
Run this script:
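The script itself did not survive the scrape; based on the examples earlier in the thread, it was presumably along these lines (model id and filename taken from the comments above):

```python
# Presumed reproduction, reconstructed from the thread; the original script
# is not in the scrape. Loading a gemma2 GGUF fails in
# modeling_gguf_pytorch_utils.py before transformers adds gemma2 support.
from transformers import AutoModelForCausalLM

model_id = "bartowski/gemma-2-2b-it-GGUF"
filename = "gemma-2-2b-it-Q6_K.gguf"

model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
```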
Expected behavior

Should load the model, but instead fails with the error in the title:

ValueError: Architecture gemma2 not supported