Hello, thanks for your great work. I have some questions from reading your paper and implementing the work. In the paper, you state that "you trained your model on a dataset of approximately 750,000 videos sampled from AudioSet." As we know, AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos.
So is the AudioSet you used for training made up of those human-labeled 10-second clips drawn from YouTube videos?
Can we download the full videos according to the YouTube IDs provided in AudioSet and split them into video clips for training? We have actually trained several models with this dataset, but some of them always predict the label "1" (aligned), and others always predict the label "0" (not aligned).
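For reference, here is a minimal sketch of the download-and-split step described above. It assumes the standard AudioSet segment CSV format (`# YTID, start_seconds, end_seconds, positive_labels`, with leading comment lines) and that `yt-dlp` and `ffmpeg` are available on the system; the helper name `audioset_clip_commands` is hypothetical, not from the paper's code:

```python
import csv
import shlex

def audioset_clip_commands(csv_path):
    """Build shell commands that download each AudioSet video and trim
    it to the labeled 10-second segment.

    Assumes the AudioSet segment CSV layout:
        # YTID, start_seconds, end_seconds, positive_labels
    where the first few lines are '#' comments.
    """
    cmds = []
    with open(csv_path) as f:
        for row in csv.reader(f, skipinitialspace=True):
            # Skip empty lines and the '#'-prefixed header comments.
            if not row or row[0].startswith("#"):
                continue
            ytid, start, end = row[0], float(row[1]), float(row[2])
            url = f"https://www.youtube.com/watch?v={ytid}"
            out = f"{ytid}_{int(start)}.mp4"
            # yt-dlp fetches the full video; ffmpeg then cuts out the
            # labeled window without re-encoding (-c copy).
            cmds.append(
                f"yt-dlp -f mp4 -o full_{ytid}.mp4 {shlex.quote(url)} && "
                f"ffmpeg -ss {start} -to {end} -i full_{ytid}.mp4 -c copy {out}"
            )
    return cmds
```

This only constructs the commands; running them is left to a shell or `subprocess` loop, since many AudioSet videos have since been removed from YouTube and the downloads will partially fail.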