Hi, this is good work.
However, while trying to reproduce it, we noticed that you used only 6.9M of the 11M images in the full SA-1B dataset. Could you release the exact image list to facilitate our reproduction? Similarly, only part of the conceptual-12m dataset was used.
Thanks!
For SA-1B, it is crucial to filter out the watermarked images. We didn't have a good detector, so we adopted a naive approach: filtering out the prompts containing human-related words. There may be better ways to filter these image datasets.
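For anyone trying to approximate this in the meantime, here is a minimal sketch of a keyword-based prompt filter. It is not the maintainers' actual script: the word list `HUMAN_WORDS`, the helper names, and the `(image_id, caption)` sample layout are all assumptions made for illustration, so the resulting subset will not match the 6.9M list exactly.

```python
import re

# Hypothetical word list; the actual list used by the maintainers
# is not given in this thread.
HUMAN_WORDS = {
    "person", "people", "man", "woman", "men", "women",
    "boy", "girl", "child", "children", "human", "face",
}

_WORD_RE = re.compile(r"[a-z]+")


def mentions_human(caption: str) -> bool:
    """Return True if any whole word in the caption is in HUMAN_WORDS."""
    tokens = _WORD_RE.findall(caption.lower())
    return any(tok in HUMAN_WORDS for tok in tokens)


def filter_samples(samples):
    """Keep (image_id, caption) pairs whose caption has no human-related word."""
    return [(image_id, caption) for image_id, caption in samples
            if not mentions_human(caption)]


# Toy data standing in for SA-1B image ids and their prompts.
samples = [
    ("sa_1", "A man walking a dog in the park"),
    ("sa_2", "A red barn beside a wheat field"),
    ("sa_3", "Mandarin oranges in a bowl"),
]
kept = filter_samples(samples)
print([image_id for image_id, _ in kept])  # ['sa_2', 'sa_3']
```

Matching on whole words rather than substrings avoids dropping captions such as "Mandarin oranges" just because they contain "man".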
Thank you for your reply. Could you please release the exact image list if possible? Manually filtering the data would be both time-consuming and inefficient.