QOL improvements and doc updates #1318

Merged
merged 10 commits into from
Jan 12, 2024
2 changes: 1 addition & 1 deletion docs/source/developer_guides/lora.md
@@ -135,4 +135,4 @@ model.unload()

# delete adapter
model.delete_adapter("dpo")
```
```
1 change: 1 addition & 0 deletions src/peft/__init__.py
@@ -83,5 +83,6 @@
    set_peft_model_state_dict,
    shift_tokens_right,
    load_peft_weights,
    cast_non_trainable_to_dtype,
)
from .config import PeftConfig, PromptLearningConfig
1 change: 1 addition & 0 deletions src/peft/utils/__init__.py
@@ -47,5 +47,6 @@
    get_auto_gptq_quant_linear,
    get_quantization_config,
    id_tensor_storage,
    cast_non_trainable_to_dtype,
)
from .save_and_load import get_peft_model_state_dict, set_peft_model_state_dict, load_peft_weights
20 changes: 20 additions & 0 deletions src/peft/utils/other.py
@@ -498,3 +498,23 @@ def id_tensor_storage(tensor: torch.Tensor) -> Tuple[torch.device, int, int]:
    unique_id = storage_ptr(tensor)

    return tensor.device, unique_id, storage_size(tensor)


def cast_non_trainable_to_dtype(model, dtype):
Member

Just noticing that we may want to use a different name, since the trainable parameters will also be cast. The name could give the impression that trainable parameters are left untouched.

Contributor Author

Updated the PR to use `cast_mixed_precision_params`, let me know if you have better naming suggestions.

    """
    Cast all non-trainable parameters of the model to the given `dtype`. The trainable parameters are cast to full
    precision. This is meant to reduce the GPU memory usage when using PEFT methods by using a half-precision dtype for
    non-trainable parameters. Having the trainable parameters in full precision preserves training stability when using
    automatic mixed precision training.

    Args:
        model (`torch.nn.Module`):
            The model to cast the non-trainable parameters of.
        dtype (`torch.dtype`):
            The dtype to cast the non-trainable parameters to.
    """
    for p in model.parameters():
        if not p.requires_grad:
            p.data = p.to(dtype)
        else:
            p.data = p.to(torch.float32)
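
The effect of the casting loop above can be checked on a toy model. This is a minimal sketch, assuming PyTorch is installed; `cast_params_sketch` and the two-layer model are hypothetical names used only for illustration and are not part of PEFT. The loop mirrors the one in the diff, except that it reads from `p.data` when casting.

```python
import torch
from torch import nn


def cast_params_sketch(model: nn.Module, dtype: torch.dtype) -> None:
    """Illustrative restatement of the loop in the diff (hypothetical name)."""
    for p in model.parameters():
        if not p.requires_grad:
            # Frozen weights: store in the requested (typically half) precision.
            p.data = p.data.to(dtype)
        else:
            # Trainable weights: keep in float32 for stable optimizer updates.
            p.data = p.data.to(torch.float32)


# Toy model: layer 0 plays the frozen base model, layer 1 the trainable adapter.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
for p in model[0].parameters():
    p.requires_grad = False

cast_params_sketch(model, torch.bfloat16)

frozen_dtypes = {p.dtype for p in model[0].parameters()}
trainable_dtypes = {p.dtype for p in model[1].parameters()}
print("frozen:", frozen_dtypes, "trainable:", trainable_dtypes)
```

As the review comment points out, both groups of parameters are touched: the frozen ones move to `dtype`, and the trainable ones are forced to `torch.float32` regardless of their previous dtype.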
30 changes: 28 additions & 2 deletions src/peft/utils/peft_types.py
@@ -20,7 +20,22 @@


class PeftType(str, enum.Enum):
    """Enum class for the different types of adapters in PEFT."""
    """
    Enum class for the different types of adapters in PEFT.

    Supported PEFT types:
    - PROMPT_TUNING
    - MULTITASK_PROMPT_TUNING
    - P_TUNING
    - PREFIX_TUNING
    - LORA
    - ADALORA
    - ADAPTION_PROMPT
    - IA3
    - LOHA
    - LOKR
    - OFT
    """

    PROMPT_TUNING = "PROMPT_TUNING"
    MULTITASK_PROMPT_TUNING = "MULTITASK_PROMPT_TUNING"
@@ -36,7 +51,18 @@ class PeftType(str, enum.Enum):


class TaskType(str, enum.Enum):
    """Enum class for the different types of tasks supported by PEFT."""
    """
    Enum class for the different types of tasks supported by PEFT.

    Overview of the supported task types:
    - SEQ_CLS: Text classification.
    - SEQ_2_SEQ_LM: Sequence-to-sequence language modeling.
    - CAUSAL_LM: Causal language modeling.
    - TOKEN_CLS: Token classification.
    - QUESTION_ANS: Question answering.
    - FEATURE_EXTRACTION: Feature extraction. Provides the hidden states which can be used as embeddings or features
      for downstream tasks.
    """

    SEQ_CLS = "SEQ_CLS"
    SEQ_2_SEQ_LM = "SEQ_2_SEQ_LM"
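
Both enums in this file subclass `str` as well as `enum.Enum`, which is why their members can be compared with plain strings and written to JSON configs directly. The sketch below illustrates that pattern with the standard library only; `AdapterKind` is a hypothetical stand-in defined for this example, not a PEFT class.

```python
import enum
import json


class AdapterKind(str, enum.Enum):
    """Hypothetical stand-in that follows the `str, enum.Enum` pattern."""

    LORA = "LORA"
    IA3 = "IA3"


# Members compare equal to their string values because of the `str` mixin.
is_equal_to_string = AdapterKind.LORA == "LORA"

# Lookup by value returns the existing member (members are singletons).
looked_up = AdapterKind("IA3")

# Members serialize as plain JSON strings, with no custom encoder needed.
serialized = json.dumps({"peft_type": AdapterKind.LORA})

# Unknown values are rejected with a ValueError.
try:
    AdapterKind("NOT_A_KIND")
    rejected = False
except ValueError:
    rejected = True

print(is_equal_to_string, looked_up.name, serialized, rejected)
```

Listing the member names in the docstring, as this diff does, complements this behavior: users who pass plain strings can see which values are accepted without reading the class body.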