
ReMoRa + DoRa improves on ReMoRa #10

Open · catid opened this issue Jun 7, 2024 · 2 comments
catid commented Jun 7, 2024

Thank you for sharing your results. In return I will share my own:

If you reformulate the code so that the forward pass adds the decompressed MoRA weights into the nn.Linear weights, the number of multiplies drops back to that of a plain linear layer: a single matmul against the summed weight. It also becomes compatible with DoRA. In my testing, alternating between repeat and repeat_interleave (ReMoRA) improves on MoRA for continued training, and ReMoRA + DoRA improves on ReMoRA.
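A minimal sketch of this reformulation, assuming PyTorch. The class name `MoRALinear`, the flat repeat/repeat_interleave expansion of the r × r matrix, and the `interleave` flag are illustrative assumptions rather than code from either repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable r x r MoRA matrix."""

    def __init__(self, base: nn.Linear, r: int):
        super().__init__()
        assert base.out_features % r == 0 and base.in_features % r == 0
        self.base = base
        self.base.weight.requires_grad_(False)       # pretrained weight stays frozen
        self.mora = nn.Parameter(torch.zeros(r, r))  # trainable square matrix
        self.interleave = False  # ReMoRA: flip each time the delta is merged

    def decompressed(self) -> torch.Tensor:
        # Expand the r x r matrix to the full (out_features, in_features)
        # shape, alternating between repeat and repeat_interleave tilings.
        n_out = self.base.out_features // self.mora.shape[0]
        n_in = self.base.in_features // self.mora.shape[1]
        if self.interleave:
            return self.mora.repeat_interleave(n_out, dim=0).repeat_interleave(n_in, dim=1)
        return self.mora.repeat(n_out, n_in)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # One matmul against the summed weight: the same multiply count as
        # the plain layer.
        w = self.base.weight + self.decompressed()
        return F.linear(x, w, self.base.bias)
```

Because the forward pass runs against the summed weight, a DoRA-style magnitude rescaling can be applied to that same weight, which is what makes the two methods compatible.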

kongds (Owner) commented Jun 7, 2024

Thanks for sharing the results and advice.

I have tested adding the decompressed MoRA weights to the base weight before, but it can be slow in large language models, since it needs to copy the entire weight during the forward pass. (Maybe this can be further optimized: to merge back, MoRA can copy its weight directly into the original linear layer, instead of first multiplying two matrices as LoRA does.)
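A rough sketch of that merge-back difference (the shapes and the tiled decompression are assumptions for illustration): LoRA must materialize B @ A with a matmul before adding it into the base weight, while an r × r MoRA block can be written straight in:

```python
import torch

d_out, d_in, r = 4096, 4096, 256
weight = torch.zeros(d_out, d_in)  # stands in for the frozen base weight

# LoRA merge-back: one (d_out, r) @ (r, d_in) matmul, then an add.
B, A = torch.randn(d_out, r), torch.randn(r, d_in)
weight += B @ A

# MoRA merge-back: no matmul; the r x r block is tiled directly in.
M = torch.randn(r, r)
weight += M.repeat(d_out // r, d_in // r)
```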

For ReMoRA + DoRA, are you adding both DoRA and MoRA to a single linear layer? That seems to use more trainable parameters than ReMoRA alone. Still, the idea of using both MoRA and LoRA in one linear layer seems interesting, since it might take advantage of both.
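For reference, a hedged sketch of what the combination could look like, with DoRA's per-row magnitude applied to the base weight plus the decompressed MoRA delta; the function name and the `m` parameter are assumptions, not code from either repository:

```python
import torch
import torch.nn.functional as F

def dora_mora_forward(x, base_weight, mora_delta, m, bias=None):
    # Merge the decompressed MoRA delta, then apply DoRA's
    # magnitude / direction decomposition to the merged weight.
    w = base_weight + mora_delta            # (out_features, in_features)
    row_norm = w.norm(dim=1, keepdim=True)  # DoRA's per-output-row norm
    return F.linear(x, m.unsqueeze(1) * w / row_norm, bias)
```

Under this formulation, the only trainable parameters beyond ReMoRA's are the `out_features` entries of the magnitude vector `m`.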

catid (Author) commented Aug 9, 2024

Example: https://github.com/catid/dora/blob/9b2055d0b8dd73890e6fbca585a0e52a6a87dde3/dora.py#L66
