ValueError: Target module QuantLinear() is not supported. #1

osbm opened this issue May 6, 2024 · 0 comments
osbm commented May 6, 2024

I am trying to run the examples/finetune.py script, but it is giving me this error:

ValueError: Target module QuantLinear() is not supported. Currently, only the following modules are supported:

/home/osbm/Documents/temp/MODULoRA-Experiment/examples
loading configuration file ./llama-7b-quantized/config.json
Model config LlamaConfig {
  "_name_or_path": "./llama-7b-quantized",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 2048,
  "max_sequence_length": 2048,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pad_token_id": -1,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.40.1",
  "use_cache": true,
  "vocab_size": 32000
}

loading configuration file ./llama-7b-quantized/config.json
Model config LlamaConfig {
  "_name_or_path": "baffo32/decapoda-research-llama-7B-hf",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "eos_token_id": 1,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 2048,
  "max_sequence_length": 2048,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pad_token_id": -1,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.40.1",
  "use_cache": true,
  "vocab_size": 32000
}

Generate config GenerationConfig {
  "bos_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": -1
}

/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py:4371: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
  warnings.warn(
The cos_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class
The sin_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class
/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
loading file tokenizer.model from cache at /home/osbm/.cache/huggingface/hub/models--huggyllama--llama-13b/snapshots/bf57045473f207bb1de1ed035ace226f4d9f9bba/tokenizer.model
loading file tokenizer.json from cache at /home/osbm/.cache/huggingface/hub/models--huggyllama--llama-13b/snapshots/bf57045473f207bb1de1ed035ace226f4d9f9bba/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at /home/osbm/.cache/huggingface/hub/models--huggyllama--llama-13b/snapshots/bf57045473f207bb1de1ed035ace226f4d9f9bba/special_tokens_map.json
loading file tokenizer_config.json from cache at /home/osbm/.cache/huggingface/hub/models--huggyllama--llama-13b/snapshots/bf57045473f207bb1de1ed035ace226f4d9f9bba/tokenizer_config.json
Traceback (most recent call last):
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/examples/finetune.py", line 94, in <module>
    model = quant_peft.get_peft_model(llm, lora_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/mapping.py", line 149, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config, adapter_name=adapter_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/peft_model.py", line 1360, in __init__
    super().__init__(model, peft_config, adapter_name)
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/peft_model.py", line 138, in __init__
    self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/tuners/lora/model.py", line 138, in __init__
    super().__init__(model, config, adapter_name)
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/tuners/tuners_utils.py", line 166, in __init__
    self.inject_adapter(self.model, adapter_name)
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/tuners/tuners_utils.py", line 372, in inject_adapter
    self._create_and_replace(peft_config, adapter_name, target, target_name, parent, current_key=key)
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/tuners/lora/model.py", line 222, in _create_and_replace
    new_module = self._create_new_module(lora_config, adapter_name, target, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/osbm/Documents/temp/MODULoRA-Experiment/.venv/lib/python3.12/site-packages/peft/tuners/lora/model.py", line 319, in _create_new_module
    raise ValueError(
ValueError: Target module QuantLinear() is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.

I am using Python 3.12.2 with these modules:

transformers==4.35.2
peft==0.6.2
torch==2.3.0
tokenizers==0.15.2
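
To make the failure easier to reproduce outside the repo, here is a minimal self-contained sketch (the QuantLinear stub and TinyModel below are my own stand-ins, not the repo's classes) that raises the same error with plain peft 0.6.x: as far as I can tell, the LoRA dispatcher only wraps torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv2d, and transformers' Conv1D (plus the bitsandbytes/auto-gptq layers it knows how to detect), so any quantized linear class it does not recognize fails exactly like the QuantLinear coming from the quantized checkpoint here.

```python
import torch
import torch.nn as nn
from peft import LoraConfig, get_peft_model

# Stand-in for a quantized linear layer: deliberately NOT a subclass of
# nn.Linear, which is what peft's LoRA dispatcher checks for. The real
# QuantLinear comes from the quantization backend, but the failure mode
# is the same for any unrecognized module class.
class QuantLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.register_buffer(
            "qweight", torch.zeros(out_features, in_features, dtype=torch.int32)
        )

    def forward(self, x):
        return x  # dummy; never called in this repro


class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = QuantLinear(16, 16)

    def forward(self, x):
        return self.q_proj(x)


lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj"],  # resolves to the QuantLinear stub above
    lora_dropout=0.05,
)

# With peft 0.6.x this raises the same error as the call at
# examples/finetune.py:94 does on the quantized LLaMA model:
#   ValueError: Target module QuantLinear() is not supported. Currently, only
#   the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`,
#   `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.
model = get_peft_model(TinyModel(), lora_config)
```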