FIX: Don't target the classification head when using target_modules="all-linear" #2033
Conversation
Fixes huggingface#2027

When using a transformers sequence classification model, target_modules="all-linear" should not wrap the classification head with LoRA. This is because it is already wrapped with ModulesToSave, i.e. it will be fully fine-tuned, which is the generally desired behavior.

Before this bug fix, the classification head would be double-wrapped. With huggingface#2028, this now raises an error. With this PR, it is avoided completely. Still, keeping huggingface#2028 is good because it helps prevent other situations where double-wrapping might occur due to misconfiguration.

Note that there is no fool-proof method to detect the classification head, so we have to rely on the transformers naming convention.
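For context, a minimal sketch of the user-facing behavior this fix enables (the checkpoint name is only illustrative; any transformers sequence classification model with a "score" or "classifier" head would do): with target_modules="all-linear", the classification head should end up as a ModulesToSave wrapper rather than a LoRA layer.

from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Illustrative checkpoint; OPT's classification head is named "score".
base_model = AutoModelForSequenceClassification.from_pretrained("facebook/opt-125m", num_labels=2)
config = LoraConfig(task_type=TaskType.SEQ_CLS, target_modules="all-linear")
model = get_peft_model(base_model, config)

# With this fix, "all-linear" skips the head: it stays a ModulesToSave wrapper
# (fully fine-tuned) while the other linear layers receive LoRA adapters.
print(model.base_model.score)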
Just a single comment regarding the design. But not a blocker.
src/peft/tuners/tuners_utils.py
Outdated
cls_head = getattr(model, "score", None) or getattr(model, "classifier", None)
if cls_head is not None:
    last_module_name = [name for name, module in model.named_modules() if module is cls_head][0]
    module_names_to_exclude.add(last_module_name)
Perhaps we could define a MAP between the task types and the attributes we know we should exclude and use that?
EXCLUSION_MAP = {TaskType.SEQ_CLS: ["score", "classifier"], ...}
...
cls_head = None
for exclude_candidate in EXCLUSION_MAP[TaskType.SEQ_CLS]:
    cls_head = getattr(model, exclude_candidate, None)
    if cls_head is not None:
        ...
The advantage of that is we just have to update the MAP in case we discover more attributes, and it should work out nicely.
WDYT?
Good point about making this easier to extend. I'm not sure if a map is the right approach, because for causal LM (the if condition above), we use a different approach based on get_output_embeddings(), so the map could not be used consistently for that task. But I will move ["score", "classifier"] into a constant and use that.
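To make that concrete, here is a rough sketch of what moving ["score", "classifier"] into a constant could look like; the constant name and surrounding variable names are hypothetical and not necessarily what was merged.

# Hypothetical constant name; the merged code may differ.
SEQ_CLS_HEAD_NAMES = ("score", "classifier")

cls_head = None
for attr_name in SEQ_CLS_HEAD_NAMES:
    cls_head = getattr(model, attr_name, None)
    if cls_head is not None:
        break

if cls_head is not None:
    # Exclude the classification head so "all-linear" does not add LoRA on top of ModulesToSave.
    head_name = [name for name, module in model.named_modules() if module is cls_head][0]
    module_names_to_exclude.add(head_name)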
I still think a map or a similar approach is better in the long term, but you of course know better here.
assert isinstance(model.base_model.score.original_module, nn.Linear)
assert isinstance(model.base_model.score.modules_to_save["default"], nn.Linear)
Sleek!