Why do “Learning a Text-Video Embedding from Incomplete and Heterogeneous Data” and “HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips” use different evaluation protocols?
Are there two test sets, 1k-A and 1k-B, each consisting of 1,000 randomly sampled text-video pairs?
I am very confused.