
Optimize DoRA computation when there is no dropout #2107

Closed
BenjaminBossan opened this issue Sep 27, 2024 · 2 comments · Fixed by #2122


@BenjaminBossan
Member

Feature request

DoRA could be made faster and use less memory if the base result were reused for the DoRA computation. However, this is only equivalent if there is no dropout (otherwise, DoRA needs the base result computed on the dropout-modified input, which differs from the stored base result). Therefore, the optimization can be applied when dropout=0 (i.e. when nn.Identity is used) or during eval mode. A sketch of the idea is shown below.
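To make the idea concrete, here is a minimal sketch of what the optimized DoRA delta could look like. This is not the actual PEFT implementation; the helper name `dora_delta` and its signature are made up for illustration, and it assumes the standard DoRA formulation in which the norm of the merged weight is detached:

```python
import torch.nn as nn
import torch.nn.functional as F


def dora_delta(x, base_layer, lora_A, lora_B, magnitude, scaling, dropout, base_result=None):
    # Hypothetical helper for illustration; names and signature do not match
    # the actual PEFT API.
    weight = base_layer.weight
    lora_weight = lora_B.weight @ lora_A.weight  # (out_features, in_features)
    # DoRA treats the norm of the merged weight as a constant (detached).
    weight_norm = (weight + scaling * lora_weight).norm(p=2, dim=1).detach()
    mag_norm_scale = (magnitude / weight_norm).view(1, -1)

    no_dropout = isinstance(dropout, nn.Identity) or not dropout.training
    if no_dropout and base_result is not None:
        # No dropout is applied, so base_layer(x) from the main forward pass
        # already equals F.linear(x, weight) up to the bias; reuse it instead
        # of recomputing the matmul.
        if base_layer.bias is not None:
            base_result = base_result - base_layer.bias
    else:
        # With active dropout, the base path must be recomputed on the
        # dropout-modified input, so no work is saved.
        x = dropout(x)
        base_result = F.linear(x, weight)

    lora_result = lora_B(lora_A(x)) * scaling
    # Delta added to the (bias-inclusive) base result, so the total output is
    # mag_norm_scale * (W @ x + scaling * B @ A @ x) + bias.
    return (mag_norm_scale - 1) * base_result + mag_norm_scale * lora_result
```

In this sketch, the caller computes `result = base_layer(x)` once and passes it in as `base_result`; the matmul is skipped only when no dropout is in effect, which is exactly the condition under which reuse is mathematically equivalent.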

Motivation

Faster and more memory-efficient DoRA when there is no dropout. Experimentally, dropout is not crucial for training DoRA; see this comment.

Your contribution

I can work on this when I have a bit of time, but contributions are very welcome.

@ariG23498
Contributor

Hey @BenjaminBossan, I would love to work on this.

Should I create a PR and then have the rest of the conversation there?

@BenjaminBossan
Member Author

Thanks @ariG23498. Do as you like: if you have code, feel free to create a (draft) PR; otherwise, discussing here is also fine.
