
all-linear + classification models have double-wrapped linear layers #1485

Closed · nbroad1881 opened this issue Feb 19, 2024 · 1 comment · Fixed by #1490

@nbroad1881
System Info

all

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForTokenClassification
from peft import LoraConfig, get_peft_model

m = AutoModelForTokenClassification.from_pretrained("microsoft/deberta-v3-xsmall", num_labels=10)

config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type="TOKEN_CLS",
    target_modules="all-linear",
    bias="none",
)

pmod = get_peft_model(m, config)

merged = pmod.merge_and_unload()

merged will still have LoRA layers in the classifier, because the classifier gets double-wrapped when target_modules="all-linear" also matches the classification head.
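
For reference, a quick way to confirm the leftover wrapping (building on the reproduction above) is to list the LoRA sub-modules that survive the merge; this is just a diagnostic sketch:

# Diagnostic only: any names printed here mean LoRA modules survived merge_and_unload
leftover = [name for name, _ in merged.named_modules() if "lora_" in name]
print(leftover)  # non-empty: the classifier still carries lora_A / lora_B sub-modules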

Expected behavior

merged should no longer contain LoRA layers in the classifier; as it stands, calling merged.save_pretrained results in uninitialized classifier weights when the saved model is loaded again.
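
To make the failure mode concrete, here is a sketch of the round trip described above (the directory name is arbitrary):

merged.save_pretrained("merged-deberta")  # classifier is saved under LoRA-wrapped parameter names
reloaded = AutoModelForTokenClassification.from_pretrained("merged-deberta", num_labels=10)
# The plain classifier.weight / classifier.bias keys are missing from the checkpoint,
# so transformers re-initializes the classification head on load.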

@BenjaminBossan
Member

Thanks a lot Nicholas. Adding more context from our internal discussion: With the given LoraConfig, we get a PEFT model that has a ModulesToSaveWrapper on top of a lora.Linear layer:

      ...
      (classifier): ModulesToSaveWrapper(
        (original_module): lora.Linear(
          (base_layer): Linear(in_features=384, out_features=10, bias=True)
          (lora_dropout): ModuleDict(
            (default): Dropout(p=0.1, inplace=False)
          )
          (lora_A): ModuleDict(
            (default): Linear(in_features=384, out_features=16, bias=False)
          )
          (lora_B): ModuleDict(
            (default): Linear(in_features=16, out_features=10, bias=False)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
        )
        (modules_to_save): ModuleDict(
          (default): lora.Linear(
            (base_layer): Linear(in_features=384, out_features=10, bias=True)
            (lora_dropout): ModuleDict(
              (default): Dropout(p=0.1, inplace=False)
            )
            (lora_A): ModuleDict(
              (default): Linear(in_features=384, out_features=16, bias=False)
            )
            (lora_B): ModuleDict(
              (default): Linear(in_features=16, out_features=10, bias=False)
            )
            (lora_embedding_A): ParameterDict()
            (lora_embedding_B): ParameterDict()
          )
        )
      )

This double wrapping is already undesirable and we should not allow it. To fix this, we can:

  1. Ensure that we fully unwrap the layers.
  2. Add more checks to

       output_emb = model.get_output_embeddings()
       if output_emb is not None:
           last_module_name = [name for name, module in model.named_modules() if module is output_emb][0]
           linear_module_names -= {last_module_name}

     to remove "classifier" layers when the task type is classification (see the sketch after this list).
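
As a rough illustration of option 2, the extra check could look something like the snippet below; peft_config, the set of task types, and the head names are assumptions for illustration, not PEFT's actual implementation:

# Hypothetical sketch only (not PEFT's actual code): when the task type is a
# classification task, also drop the head module(s) from the "all-linear" candidates.
classification_task_types = {"SEQ_CLS", "TOKEN_CLS"}  # assumed set of relevant task types
if peft_config.task_type in classification_task_types:
    head_names = {"classifier", "score"}  # assumed common names of classification heads
    linear_module_names -= {
        name for name in linear_module_names if name.split(".")[-1] in head_names
    }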

In addition, I think it's dubious to have a ModulesToSaveWrapper on top of a LoraLayer. I wonder if we should raise an error right there when we encounter this, or is there any scenario where this is desired?
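
For illustration, such a guard could look roughly like this; the function name and call site are hypothetical, and this is only a sketch of the idea, not PEFT's code:

from peft.tuners.tuners_utils import BaseTunerLayer

def check_not_already_tuned(module, name):
    # Hypothetical guard: refuse to put a ModulesToSaveWrapper around a module
    # that is already a tuner layer, which is what produces the double wrapping above.
    if isinstance(module, BaseTunerLayer):
        raise ValueError(
            f"Module '{name}' is matched by both target_modules and modules_to_save; "
            "remove it from one of the two."
        )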

Ping @younesbelkada @pacman100

BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Feb 20, 2024
Resolves huggingface#1485, but note that some additional solutions are mentioned in
that issue.

This checks that when unloading a PEFT model, if the
ModulesToSaveWrapper contains a tuner module, it is correctly unloaded.
The unloaded model should not have PEFT layers at the end.
BenjaminBossan added a commit that referenced this issue Feb 20, 2024
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Mar 14, 2024