[help] Can the training process only be implemented through m-LoRA? #11

Open
zxy1728 opened this issue Jul 30, 2024 · 2 comments
Labels
help wanted Extra attention is needed

Comments


zxy1728 commented Jul 30, 2024

Excuse me, can training only be run through m-LoRA? The torch and transformers versions it requires don't match the ones I have installed. Is there a solution?

@mikecovlee
Member

For the moment, yes. This is because it is difficult to pass the router loss through without modifying the transformers library. If you don't need the router loss, the current code should also be able to support training with simple modifications.
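
To illustrate what "passing the router loss" involves, here is a minimal sketch of an auxiliary router loss being added to the language-modeling loss. The thread does not say how this project computes its router loss, so the sketch assumes a Switch-Transformer-style load-balancing term for top-1 routing. The names `load_balancing_loss`, `total_loss`, and `router_loss_coef` are hypothetical, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one token's expert logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def load_balancing_loss(router_logits):
    """Assumed Switch-Transformer-style auxiliary loss for top-1 routing.

    router_logits holds one list of per-expert logits for each token.
    Returns num_experts * sum_i(f_i * p_i), where f_i is the fraction of
    tokens routed to expert i and p_i is the mean router probability of i.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    probs = [softmax(row) for row in router_logits]

    # f_i: share of tokens whose highest-probability expert is i.
    counts = [0] * num_experts
    for row in probs:
        counts[row.index(max(row))] += 1
    f = [c / num_tokens for c in counts]

    # p_i: router probability for expert i, averaged over tokens.
    p = [sum(row[i] for row in probs) / num_tokens for i in range(num_experts)]

    return num_experts * sum(fi * pi for fi, pi in zip(f, p))


def total_loss(lm_loss, router_logits, router_loss_coef=0.01):
    """Combine the LM loss with the weighted router loss (coefficient is illustrative)."""
    return lm_loss + router_loss_coef * load_balancing_loss(router_logits)
```

The point of the sketch is that the training loop needs the router logits from every MoE layer in addition to the usual loss, which is the extra output that is hard to obtain from an unmodified transformers forward pass. Dropping the second term in `total_loss` corresponds to training without the router loss.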

@mikecovlee added the "help wanted" label Jul 30, 2024
@mikecovlee changed the title from "Can the training process only be implemented through mlora?" to "[help] Can the training process only be implemented through m-LoRA?" Jul 30, 2024
@mikecovlee
Member

Also, the torch and transformers versions that m-LoRA requires can probably be lowered a little without much impact.
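
Before relaxing any version pins, it can help to see what is actually installed. The following is a small sketch using only the standard library. The minimum versions in the example call are placeholders, not m-LoRA's real requirements, and `parse_version` and `check_versions` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '2.3.1+cu121' into (2, 3, 1), ignoring local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(minimums):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    # Placeholder minimums for illustration only, NOT taken from m-LoRA.
    placeholders = {"torch": "2.1.0", "transformers": "4.38.0"}
    for name, (installed, ok) in check_versions(placeholders).items():
        print(name, installed, "ok" if ok else "below minimum or missing")
```

Replace the placeholder values with the versions listed in the project's own requirements file to see how far your environment is from them.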
