Reproducing fine-tuned tokens (textual inversion) #24
Comments
Hello bansh123,

Thanks for bringing this to my attention. I'll revisit this experiment to determine why you are seeing this behavior.

In the meantime, for the latest paper update on OpenReview, we used fine_tune_upstream.py, initializing with the class name of the object and training 4 vectors per concept in the dataset. We trained these tokens with a batch size of 8, Stable Diffusion 1.4, 2000 gradient descent steps, and a learning rate of 5.0e-04, scaled by the effective batch size.

Which figure are you working to reproduce?

-Brandon
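For reference, a minimal sketch of the learning-rate scaling described above. The variable names here are illustrative and the accumulation/process counts are assumptions; the actual arguments in fine_tune_upstream.py may differ.

```python
# Illustrative sketch: scale the base learning rate by the effective batch size.
base_learning_rate = 5.0e-04       # value quoted above
train_batch_size = 8               # per-device batch size quoted above
gradient_accumulation_steps = 1    # assumed
num_processes = 1                  # assumed single GPU

# effective batch size = per-device batch * accumulation steps * processes
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_processes

# learning rate actually passed to the optimizer
learning_rate = base_learning_rate * effective_batch_size
print(learning_rate)  # 4.0e-03 under these assumptions
```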
Thank you for your response.
Hi, I am trying to reproduce Figure 5. Could you provide some hints (or scripts) for the visualization, since there are multiple methods to evaluate together?
I've tried using the PASCAL tokens provided by the author, but I can't reproduce the reported performance. Thank you.
Hi @brandontrabucco, would it be possible to share via Google Drive the new embeddings obtained by initializing with the original class name and 4 num_vectors? Recomputing the embeddings requires a lot of compute for more classes. Thanks in advance for considering this request.
@brandontrabucco
Some original class names consist of multiple words and therefore tokenize to more than one token, e.g. "pink primrose" in the Flowers102 dataset.
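For instance, a quick check with the Hugging Face CLIP tokenizer that Stable Diffusion 1.4 ships with illustrates the multi-token issue (shown only as an example; the exact BPE split depends on the vocabulary):

```python
from transformers import CLIPTokenizer

# Tokenizer used by Stable Diffusion 1.4's text encoder
tokenizer = CLIPTokenizer.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="tokenizer"
)

# A multi-word class name from Flowers102 maps to more than one token,
# so it cannot serve directly as a single initializer token.
ids = tokenizer("pink primrose", add_special_tokens=False)["input_ids"]
print(len(ids), tokenizer.convert_ids_to_tokens(ids))
# prints the number of tokens and the subword pieces for "pink primrose"
```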
@bansh123 Did you manage to reproduce the results? I am having trouble getting the results presented in the paper (Figure 5, specifically). Do you have any tips (or scripts) you can share to help me? Thank you.
@jsw6872 Did you manage to reproduce the results?
How did you resolve the issue? I would really appreciate your help. Thanks!
I have attempted to reproduce the results of the few-shot classification task on the PASCAL VOC dataset.
I managed to achieve comparable outcomes when utilizing the fine-tuned tokens you previously shared via the Google Drive link.
However, I was unsuccessful in reproducing the fine-tuned tokens.
However, when running fine_tune.py and aggregate_embeddings.py with the provided scripts, I obtained inferior tokens, resulting in significantly lower accuracy (roughly a 10% gap in the 1-shot setting).
Am I overlooking something?
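In case it helps with debugging, here is a minimal sketch of how the shared fine-tuned tokens could be loaded back into the text encoder before evaluation. It assumes the file is a PyTorch dict mapping each placeholder token string to its learned embedding tensor; the file name and the actual format produced by aggregate_embeddings.py are assumptions and may differ.

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModel

# Assumed format: {"<placeholder-token>": tensor of shape (num_vectors, dim)}.
# Adjust to the real output of aggregate_embeddings.py.
learned = torch.load("pascal-tokens.pt", map_location="cpu")  # hypothetical file

tokenizer = CLIPTokenizer.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="text_encoder")

for token, embedding in learned.items():
    embedding = embedding.reshape(-1, embedding.shape[-1])  # (num_vectors, dim)
    # Register one placeholder token per learned vector: <token>, <token>_1, ...
    names = [token if i == 0 else f"{token}_{i}" for i in range(embedding.shape[0])]
    tokenizer.add_tokens(names)
    text_encoder.resize_token_embeddings(len(tokenizer))
    ids = tokenizer.convert_tokens_to_ids(names)
    with torch.no_grad():
        for i, tid in enumerate(ids):
            # Copy each learned vector into the corresponding embedding row
            text_encoder.get_input_embeddings().weight[tid] = embedding[i]
```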