Feature request
The paper Controlling Text-to-Image Diffusion by Orthogonal Finetuning proposes a new method for fine-tuning text-to-image diffusion models by applying learned orthogonal transformations to the layers of the pretrained model.
This preserves the hyperspherical energy of the pretrained model (the sum of pairwise hyperspherical similarities, e.g. cosine similarity, between all neurons in the same layer), which leads to better generalization, more stable training, and faster convergence. The transformation is essentially a rotation of the neurons.
In theory, OFT can be applied to any layer and has some interesting interpretations for convolutional layers. For a fair comparison, the original paper only applies OFT to the same layers as LoRA.
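To illustrate the core idea, here is a rough sketch (not the paper's reference implementation; the class name OFTLinear and the skew_params attribute are made up for this example): a frozen nn.Linear whose output neurons are rotated by a learned orthogonal matrix, kept orthogonal via the Cayley transform of a skew-symmetric parameter.

```python
import torch
import torch.nn as nn


class OFTLinear(nn.Module):
    """Hypothetical sketch of an OFT-style adapter around a frozen nn.Linear."""

    def __init__(self, base_layer: nn.Linear):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen

        out_features = base_layer.out_features
        # Free parameters of a skew-symmetric matrix; initialized to zero
        # so that R = I and training starts exactly at the pretrained model.
        self.skew_params = nn.Parameter(torch.zeros(out_features, out_features))

    def orthogonal_matrix(self) -> torch.Tensor:
        # Cayley transform: for skew-symmetric S, R = (I - S)^{-1} (I + S) is orthogonal.
        s = self.skew_params - self.skew_params.T  # enforce skew-symmetry
        eye = torch.eye(s.shape[0], device=s.device, dtype=s.dtype)
        return torch.linalg.solve(eye - s, eye + s)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rotating the rows of W with an orthogonal R preserves all pairwise
        # angles (cosine similarities) between neurons, i.e. the hyperspherical
        # energy of the pretrained layer. The bias is left untouched for simplicity.
        rotated_weight = self.orthogonal_matrix() @ self.base_layer.weight
        return nn.functional.linear(x, rotated_weight, self.base_layer.bias)


# Quick shape check of the hypothetical adapter.
layer = OFTLinear(nn.Linear(16, 8))
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 8])
```

If I remember correctly, the paper additionally constrains the orthogonal matrix to be block-diagonal to keep the number of trainable parameters small; the sketch above uses a full matrix only for simplicity.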
The corresponding repository gained 200 stars in 5 months, so there is definitely interest in the method. I think what is lacking is an easy-to-use implementation.
Motivation
I stumbled upon this paper and think it is a really unique approach to fine-tuning diffusion models that clearly works well in practice. I believe an easy-to-use implementation is all that is missing for this method to benefit a lot of users.
Your contribution
I would love to contribute a full implementation of this method to peft, if there is no reason not to include it in the main branch.