> In practice, to avoid rapid overfitting and stabilize the fine-tuning, we re-weight ReFL loss and regularize with pre-training loss.
Our code only demonstrates the ReFL loss; you will need to add the pre-training loss according to your own settings.
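To make the quoted note concrete, here is a minimal sketch of one way the two terms could be combined, assuming a simple weighted sum; `lambda_refl` and its default value are illustrative, not the authors' actual implementation:

```python
import torch

def combined_loss(refl_loss: torch.Tensor,
                  pretrain_loss: torch.Tensor,
                  lambda_refl: float = 1e-3) -> torch.Tensor:
    """Re-weight the ReFL loss and regularize with the pre-training loss.

    lambda_refl is an illustrative re-weighting coefficient, not a value
    confirmed by the authors; pretrain_loss is the standard denoising
    (noise-prediction MSE) loss computed on the pre-training data.
    """
    return lambda_refl * refl_loss + pretrain_loss
```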
Thanks for your reply.
In this answer (#24 (comment)), you mentioned that 'it is simpler to use ReFL alone directly and to achieve decent results.' According to that statement, using only the ReFL loss should yield reasonably good results, but I am unable to achieve this, even though the loss appears to have converged.
Additionally, the paper mentions: 'the pre-training dataset is from a 625k subset of LAION-5B [50] selected by aesthetic score.' I wonder if you plan to release this part of the dataset.
The performance of my ReFL model after training on refl_data.json (https://github.com/THUDM/ImageReward/blob/main/data/refl_data.json) is significantly worse than that of the untrained SD1.4 model. The results are far from satisfactory, and I'm not sure what is causing this issue.
Training Settings:
GPUs: 2 × A100
--train_batch_size: 8
--gradient_accumulation_steps: 4
--num_train_epochs: 100
--learning_rate: 1e-5
(effective batch size: 2 GPUs × 8 per-GPU batch × 4 accumulation steps = 64)
Given:
seed: 100
prompt: a coffee mug made of cardboard
Result of untrained SD1.4: (image)
Result of trained ReFL: (image)
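For reference, both results were generated with the same seed and prompt using a snippet along these lines (`./refl_output` is a placeholder path for my fine-tuned checkpoint):

```python
import torch
from diffusers import StableDiffusionPipeline

prompt = "a coffee mug made of cardboard"

# "./refl_output" is a placeholder for where my ReFL checkpoint is saved.
for name, path in [("untrained_sd14", "CompVis/stable-diffusion-v1-4"),
                   ("trained_refl", "./refl_output")]:
    pipe = StableDiffusionPipeline.from_pretrained(path).to("cuda")
    generator = torch.Generator(device="cuda").manual_seed(100)  # seed: 100
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"{name}.png")
```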
Could you please explain this phenomenon?