[Minor] Possible typos in weight initialization #146

Open

awgu opened this issue Nov 11, 2023 · 0 comments

The recent commit a0a92e0 flipped A and B in the comment for the LoRA Linear module:

LoRA/loralib/layers.py

Lines 119 to 125 in a0a92e0

    def reset_parameters(self):
        nn.Linear.reset_parameters(self)
        if hasattr(self, 'lora_A'):
            # initialize B the same way as the default for nn.Linear and A to zero
            # this is different than what is described in the paper but should not affect performance
            nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
            nn.init.zeros_(self.lora_B)
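
For reference, here is a minimal standalone sketch (with made-up shapes; r, in_features, and out_features are illustrative values, not the ones used in loralib) showing that the quoted Linear code already has the property the paper cares about, a zero update at initialization, so only the comment has A and B swapped:

    import math

    import torch
    import torch.nn as nn

    # Made-up shapes for illustration: rank r = 4, in_features = 16, out_features = 32.
    r, in_features, out_features = 4, 16, 32
    lora_A = nn.Parameter(torch.empty(r, in_features))
    lora_B = nn.Parameter(torch.empty(out_features, r))

    # Mirror the quoted Linear initialization: A gets Kaiming-uniform, B is zeroed.
    nn.init.kaiming_uniform_(lora_A, a=math.sqrt(5))
    nn.init.zeros_(lora_B)

    # Because B starts at zero, the low-rank update B @ A is all zeros, so the
    # adapted layer initially behaves exactly like the frozen pretrained layer.
    assert torch.all(lora_B @ lora_A == 0)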

The LoRA Embedding module similarly appears to have the A and B initializations flipped, both relative to its own comment and to the paper (I am not sure if this is intentional):

LoRA/loralib/layers.py

Lines 55 to 60 in a0a92e0

    def reset_parameters(self):
        nn.Embedding.reset_parameters(self)
        if hasattr(self, 'lora_A'):
            # initialize A the same way as the default for nn.Linear and B to zero
            nn.init.zeros_(self.lora_A)
            nn.init.normal_(self.lora_B)

Following the paper, I would expect nn.init.normal_(self.lora_A) and nn.init.zeros_(self.lora_B).
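
For concreteness, here is a sketch of the Embedding initialization I would expect based on the paper (same attribute names as the quoted snippet; the current behavior may well be intentional, in which case only the comment needs updating):

    def reset_parameters(self):
        nn.Embedding.reset_parameters(self)
        if hasattr(self, 'lora_A'):
            # initialize A from a normal distribution and B to zero, as in the paper,
            # so the low-rank update B A contributes nothing at the start of training
            nn.init.normal_(self.lora_A)
            nn.init.zeros_(self.lora_B)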

I can open a PR to fix these if you want (though I cannot seem to save the file without removing trailing whitespace, for some reason).
