Fix compacter init weights #516
Conversation
Looks good to me, just left some minor comments. Thanks for fixing.
src/transformers/adapters/utils.py (Outdated)

@@ -84,6 +84,7 @@
     "scaling": 1.0,
 }
 ADAPTER_CONFIG_STRING_PATTERN = re.compile(r"^(?P<name>[^\[\]\|\n]+)(?:\[(?P<kvs>.*)\])?$")
+SUBMODEL_NAMES = {"clip": ["vision_config", "text_config"], "encoder-decoder": ["encoder", "decoder"]}
Could we alternatively place this in wrappers/configuration.py next to the CONFIG_CLASS_KEYS_MAPPING to have all mappings related to model configs in the same place? Could also be helpful to mention this in the contributing docs somewhere
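For illustration, a minimal sketch of that suggestion, assuming the layout of wrappers/configuration.py; the contents of CONFIG_CLASS_KEYS_MAPPING are placeholders here, only SUBMODEL_NAMES is taken from the diff above:

```python
# src/transformers/adapters/wrappers/configuration.py (sketch only)

# Existing per-model mapping of config attribute names (contents elided).
CONFIG_CLASS_KEYS_MAPPING = {
    # "bert": {...}, "gpt2": {...}, ...
}

# Models whose top-level config wraps sub-model configs, kept next to
# CONFIG_CLASS_KEYS_MAPPING so all model-config mappings live in one place.
SUBMODEL_NAMES = {
    "clip": ["vision_config", "text_config"],
    "encoder-decoder": ["encoder", "decoder"],
}
```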
parameters["W_up_left"] = nn.Parameter(W_up_left, requires_grad=True) | ||
parameters["W_up_right"] = nn.Parameter(W_up_right, requires_grad=True) | ||
|
||
def init_shared_parameters(config, in_features, device): |
A (very short) method doc for this method would be nice :)
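As an illustration only, a possible short docstring along these lines; the parameter semantics are inferred from the signature shown in the diff, not taken from the PR:

```python
def init_shared_parameters(config, in_features, device):
    """
    Create and initialize the weights shared across all compacter modules.

    Args:
        config: adapter configuration holding the compacter/PHM hyperparameters.
        in_features: input dimension of the layers the adapter is attached to.
        device: torch device on which the shared parameters are created.

    Returns:
        A dict mapping parameter names to initialized nn.Parameter objects.
    """
    ...
```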
    return parameters


def init_W(config, W_left=None, W_right=None, W=None):
Same as above :)
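Analogously, a possible one-line doc for init_W; again only a sketch inferred from the signature, the actual initialization scheme used by the library is not shown here:

```python
def init_W(config, W_left=None, W_right=None, W=None):
    """
    Initialize a compacter weight in place, either the full matrix W or its
    factors W_left/W_right, according to the init scheme selected in config,
    and return the initialized tensor(s).
    """
    ...
```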
Fix resume_from_checkpoint (adapter-hub#514)
Add initialization of a variable so that invalid checkpoints throw an understandable error.

Fix LoRA & (IA)³ implementation for Bart & MBart (adapter-hub#518)
Fixes a critical issue in the LoRA & (IA)³ implementation of Bart & MBart, where LoRA & (IA)³ weights were not added to the intermediate and output linear layers of the model's decoder blocks. I.e., adapter configs with intermediate_lora=True or output_lora=True were added incorrectly to (M)Bart models. For LoRA, this does not affect the default config; for (IA)³ it does (intermediate_lora=True). To ensure correct addition of weights in the future, the get_adapter() tests are updated to count the number of modules added per adapter.

Fix python3.7 Compatibility (adapter-hub#510)
Compatibility with python3.8+ and pytorch 1.12.1+.

Restore compatibility in GPT-2 LoRALinear bias init (adapter-hub#525)

Fix compacter init weights (adapter-hub#516)

Update doc chapter "Getting Started" (adapter-hub#527)

Update version to 3.2.1

Fix Notebook01 Dataset column_rename (adapter-hub#543)

Update doc chapter "Adapter Methods" (adapter-hub#535)

Do not stale issues labeled as bugs (adapter-hub#550)
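To illustrate the module-counting idea mentioned for adapter-hub#518, a minimal sketch under the assumption that get_adapter() returns a per-layer mapping of the adapter's modules; the model size, adapter name, and printed check are illustrative, not the actual test values from the PR:

```python
from transformers import BartConfig
from transformers.adapters import BartAdapterModel, IA3Config

# Tiny Bart model; sizes are chosen only to keep the example fast.
config = BartConfig(
    d_model=32,
    encoder_layers=2,
    decoder_layers=2,
    encoder_attention_heads=4,
    decoder_attention_heads=4,
    encoder_ffn_dim=64,
    decoder_ffn_dim=64,
)
model = BartAdapterModel(config)

# (IA)³ uses intermediate_lora=True by default, which was affected by the bug.
model.add_adapter("probe", config=IA3Config())

# Count how many modules the adapter actually added, across all layers.
adapter_modules = model.get_adapter("probe")
num_modules = sum(len(per_layer) for per_layer in adapter_modules.values())
print(f"modules added by adapter 'probe': {num_modules}")

# A regression test would assert num_modules against a known expected count,
# so that missing decoder intermediate/output weights are caught immediately.
```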
Fixes the compacter shared weight initialization addressed in #515. This pull request