
all-linear + classification models have double-wrapped linear layers #1485

Closed · nbroad1881 opened this issue Feb 19, 2024 · 1 comment · Fixed by #1490

@nbroad1881
System Info

all

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForTokenClassification
from peft import LoraConfig, get_peft_model

m = AutoModelForTokenClassification.from_pretrained("microsoft/deberta-v3-xsmall", num_labels=10)

config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type="TOKEN_CLS",
    target_modules="all-linear",
    bias="none",
)

pmod = get_peft_model(m, config)

merged = pmod.merge_and_unload()

merged will still have LoRA layers in the classifier, because the classifier gets double-wrapped when target_modules="all-linear" also matches the classification head.
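
For reference, a quick way to confirm the leftover wrapping (building on the reproduction above) is to list the LoRA sub-modules that survive the merge; this is just a diagnostic sketch:

# Diagnostic only: any names printed here mean LoRA modules survived merge_and_unload
leftover = [name for name, _ in merged.named_modules() if "lora_" in name]
print(leftover)  # non-empty: the classifier still carries lora_A / lora_B sub-modules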

Expected behavior

merged should no longer contain LoRA layers in the classifier; as it stands, calling merged.save_pretrained results in uninitialized classifier weights when the saved model is loaded again.
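
To make the failure mode concrete, here is a sketch of the round trip described above (the directory name is arbitrary):

merged.save_pretrained("merged-deberta")  # classifier is saved under LoRA-wrapped parameter names
reloaded = AutoModelForTokenClassification.from_pretrained("merged-deberta", num_labels=10)
# The plain classifier.weight / classifier.bias keys are missing from the checkpoint,
# so transformers re-initializes the classification head on load.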

@BenjaminBossan
Member

Thanks a lot Nicholas. Adding more context from our internal discussion: With the given LoraConfig, we get a PEFT model that has a ModulesToSaveWrapper on top of a lora.Linear layer:

      ...
      (classifier): ModulesToSaveWrapper(
        (original_module): lora.Linear(
          (base_layer): Linear(in_features=384, out_features=10, bias=True)
          (lora_dropout): ModuleDict(
            (default): Dropout(p=0.1, inplace=False)
          )
          (lora_A): ModuleDict(
            (default): Linear(in_features=384, out_features=16, bias=False)
          )
          (lora_B): ModuleDict(
            (default): Linear(in_features=16, out_features=10, bias=False)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
        )
        (modules_to_save): ModuleDict(
          (default): lora.Linear(
            (base_layer): Linear(in_features=384, out_features=10, bias=True)
            (lora_dropout): ModuleDict(
              (default): Dropout(p=0.1, inplace=False)
            )
            (lora_A): ModuleDict(
              (default): Linear(in_features=384, out_features=16, bias=False)
            )
            (lora_B): ModuleDict(
              (default): Linear(in_features=16, out_features=10, bias=False)
            )
            (lora_embedding_A): ParameterDict()
            (lora_embedding_B): ParameterDict()
          )
        )
      )

This double wrapping is already undesirable and we should not allow it. To fix this, we can:

  1. Ensure that we fully unwrap the layers.
  2. Add more checks to

       output_emb = model.get_output_embeddings()
       if output_emb is not None:
           last_module_name = [name for name, module in model.named_modules() if module is output_emb][0]
           linear_module_names -= {last_module_name}

     to remove "classifier" layers when the task type is classification (see the sketch after this list).
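
As a rough illustration of option 2, the extra check could look something like the snippet below; peft_config, the set of task types, and the head names are assumptions for illustration, not PEFT's actual implementation:

# Hypothetical sketch only (not PEFT's actual code): when the task type is a
# classification task, also drop the head module(s) from the "all-linear" candidates.
classification_task_types = {"SEQ_CLS", "TOKEN_CLS"}  # assumed set of relevant task types
if peft_config.task_type in classification_task_types:
    head_names = {"classifier", "score"}  # assumed common names of classification heads
    linear_module_names -= {
        name for name in linear_module_names if name.split(".")[-1] in head_names
    }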

In addition, I think it's dubious to have a ModulesToSaveWrapper on top of a LoraLayer. I wonder if we should raise an error right there when we encounter this, or is there any scenario where this is desired?
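
For illustration, such a guard could look roughly like this; the function name and call site are hypothetical, and this is only a sketch of the idea, not PEFT's code:

from peft.tuners.tuners_utils import BaseTunerLayer

def check_not_already_tuned(module, name):
    # Hypothetical guard: refuse to put a ModulesToSaveWrapper around a module
    # that is already a tuner layer, which is what produces the double wrapping above.
    if isinstance(module, BaseTunerLayer):
        raise ValueError(
            f"Module '{name}' is matched by both target_modules and modules_to_save; "
            "remove it from one of the two."
        )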

Ping @younesbelkada @pacman100

BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Feb 20, 2024
Resolves huggingface#1485, but note that some additional solutions are mentioned in
that issue.

This checks that when unloading a PEFT model, if the
ModulesToSaveWrapper contains a tuner module, it is correctly unloaded.
The unloaded model should not have PEFT layers at the end.
BenjaminBossan added a commit that referenced this issue Feb 20, 2024
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Mar 14, 2024