You can train the model on any dataset as long as it contains image-text pairs. For Flickr30k, you would probably need to compute the average length of all its captions and then adjust the number of diffusion steps according to that average.
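As a rough sketch of the first step, the average caption length could be computed like this. The function name `average_caption_length` and the sample captions are made up for illustration, and splitting on whitespace is an assumption: if the model uses its own tokenizer, lengths should be measured with that instead. How the average maps to a number of diffusion steps depends on the model's configuration and is not shown here.

```python
from statistics import mean


def average_caption_length(captions):
    """Return the mean number of whitespace-separated tokens per caption.

    Assumption: whitespace splitting stands in for the model's real tokenizer.
    """
    # Skip empty or whitespace-only captions so they do not drag the mean down.
    lengths = [len(caption.split()) for caption in captions if caption.strip()]
    if not lengths:
        raise ValueError("no non-empty captions provided")
    return mean(lengths)


# Hypothetical captions; replace with the captions loaded from your copy of Flickr30k.
sample_captions = [
    "A dog runs across a grassy field.",
    "Two children play on the beach.",
    "A man in a red jacket rides a bicycle down the street.",
]

avg_len = average_caption_length(sample_captions)
print(f"average caption length: {avg_len:.2f} tokens")
```

Running this over the full caption file, instead of the three sample strings, would give the value to base the step count on.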