CLIP feature #6
Comments
An example of extracting Flickr8k CLIP features is given in https://github.com/xu-shitong/flickr8k-CLIP-freature/blob/master/building/datatensor_creator.py . If you wish to try another dataset, you need an encoder, such as the CLIP model, to extract each sample's feature vector as input to the diffusion model.
Hello, I have identified some issues in the code linked above. It appears that the code retrieves the CLIP feature image using the line
Yes, you are right. The code is just an example of how to extract features. Different code was used to extract the Flickr30k dataset features.
Thank you for your responses and contributions. Following your input, I used the provided code to extract feature files from the Flickr30k dataset and then trained the model on the combined Flickr30k and Flickr8k data. The BLEU-4 score rose notably, to 30.7, which deviates significantly from the results reported in the paper. To resolve this discrepancy, I intend to go through your code once more, re-extract the features, and retrain the model.
Hello, in datatensor_creator.py, if I want to get image_all_final.pickle instead of image_all_40.pickle, do I just need to change the value of 'start' from 40000 to 0? The code you provided doesn't seem to be complete.
Yes, that's correct. The code is written this way only because my machine could not extract the features for the whole dataset in one go, so I had to manually restrict the program to one subset of samples at a time and combine the extracted features later.
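The extract-in-subsets-then-combine workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code from datatensor_creator.py: the helper names `extract_chunk` and `merge_chunks`, the `encode` callable, and the assumption that each pickle holds a plain list of per-sample features are all hypothetical.

```python
import pickle
from typing import Any, Callable, List, Sequence


def extract_chunk(samples: Sequence[Any], encode: Callable[[Any], Any],
                  start: int, end: int, out_path: str) -> None:
    """Encode samples[start:end] only and pickle the result to out_path."""
    # `encode` stands in for whatever encoder is used (hypothetical here).
    features = [encode(sample) for sample in samples[start:end]]
    with open(out_path, "wb") as f:
        pickle.dump(features, f)


def merge_chunks(chunk_paths: Sequence[str], out_path: str) -> List[Any]:
    """Concatenate the per-chunk feature lists, in order, into one pickle."""
    merged: List[Any] = []
    for path in chunk_paths:
        with open(path, "rb") as f:
            # Assumes every chunk file holds a list (see the note above).
            merged.extend(pickle.load(f))
    with open(out_path, "wb") as f:
        pickle.dump(merged, f)
    return merged
```

With helpers like these, setting `start` to 0 and `end` to the dataset size would produce everything in one pass, provided the machine has enough memory. Otherwise, each run covers one range and `merge_chunks` joins the outputs afterwards.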
Thank you for sharing such great code. I have a question, though: how did you extract the features of the Flickr dataset? I want to swap in another dataset to see the effect, but I don't know how you extracted the features. Could you please give me some advice?
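For readers who want to try another dataset, a rough sketch of image feature extraction with the openai/CLIP package is shown below. It is not the repository's own extraction code: the `ViT-B/32` variant, the batch size, and the helper names `batched` and `extract_clip_features` are assumptions made for illustration, and the heavy dependencies (`torch`, `clip`, `Pillow`) are imported lazily inside the function.

```python
from typing import Any, Iterator, List, Sequence


def batched(items: Sequence[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield list(items[i:i + batch_size])


def extract_clip_features(image_paths: Sequence[str],
                          model_name: str = "ViT-B/32",  # assumed variant
                          batch_size: int = 64):
    """Return one CLIP image embedding per path as an (N, D) tensor."""
    # Imported here so the module loads without these packages installed.
    import torch
    import clip  # pip install git+https://github.com/openai/CLIP.git
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load(model_name, device=device)
    model.eval()

    chunks = []
    with torch.no_grad():
        for batch in batched(image_paths, batch_size):
            # preprocess resizes, crops and normalizes each image for CLIP.
            images = torch.stack(
                [preprocess(Image.open(path)) for path in batch]
            ).to(device)
            chunks.append(model.encode_image(images).float().cpu())
    # Assumes image_paths is non-empty; torch.cat fails on an empty list.
    return torch.cat(chunks)
```

Whether the downstream model expects these pooled embeddings or some other CLIP output is not stated in this thread, so the returned tensor may need adapting to the input format the training code actually reads.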