Feature request
DoRA could be made faster and use less memory if the base layer's result were reused in the DoRA computation. However, this is only equivalent when there is no dropout, because the base result that DoRA needs is computed on the input after dropout. Therefore, the optimization could be applied when dropout=0 (i.e. when nn.Identity is used) or during eval mode.
Motivation
Faster and more memory-efficient DoRA when there is no dropout. Experimentally, dropout is not crucial for training DoRA; see this comment.
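To illustrate why the reuse is only valid without dropout, here is a minimal, dependency-free sketch. It uses plain Python lists rather than PyTorch, and the helper names (`matvec`, `dora_delta`), the per-output-row weight norm, and all toy values are assumptions made for illustration, not the library's actual code.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def dora_delta(x, W, A, B, magnitude, scaling, base_result=None):
    """Return the DoRA term that is added to the base layer output.

    x is the input after LoRA dropout. If base_result is given, it is reused
    instead of recomputing W @ x; that is only valid when dropout is a no-op.
    (Hypothetical helper, written for this sketch.)
    """
    rank = len(A)
    # B @ A, needed only to compute the norm of the combined weight.
    BA = [[sum(B[i][k] * A[k][j] for k in range(rank)) for j in range(len(A[0]))]
          for i in range(len(B))]
    # Per-output-row norm of W + scaling * B @ A (an assumption of this sketch).
    norms = [math.sqrt(sum((w + scaling * d) ** 2 for w, d in zip(w_row, d_row)))
             for w_row, d_row in zip(W, BA)]
    scale = [m / n for m, n in zip(magnitude, norms)]
    if base_result is None:
        base_result = matvec(W, x)  # the extra matmul the issue wants to avoid
    lora_result = matvec(B, matvec(A, x))
    return [(s - 1) * b + s * scaling * l
            for s, b, l in zip(scale, base_result, lora_result)]

# Toy layer: 2 inputs, 2 outputs, rank 1.
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[0.5, -0.5]]
B = [[1.0], [2.0]]
magnitude = [1.5, 2.5]
scaling = 2.0
x = [1.0, -1.0]

result = matvec(W, x)  # base layer output, computed without dropout

# No dropout: reusing the base result gives the same DoRA term.
slow = dora_delta(x, W, A, B, magnitude, scaling)
fast = dora_delta(x, W, A, B, magnitude, scaling, base_result=result)

# Dropout with p=0.5 that zeroes the second input and rescales the first.
x_dropped = [2.0, 0.0]
right = dora_delta(x_dropped, W, A, B, magnitude, scaling)
wrong = dora_delta(x_dropped, W, A, B, magnitude, scaling, base_result=result)
```

Here `slow` and `fast` agree, while `wrong` differs from `right` because the reused base result was computed on the input before dropout.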
Your contribution
I can work on this when I have a bit of time, but contributions are very welcome.