where is 2M audioset data and pretrain_audioset2M.sh? #21

Open
JHjang223 opened this issue Aug 10, 2023 · 3 comments
Comments

@JHjang223

Thank you, Meta, for your hard work on the AudioMAE implementation.
I want to train on the 2M data, but AudioSet only releases features, so I couldn't get the raw audio. I was eventually able to get the 20k subset from another website. Where can I download the 2M data? I also can't find pretrain_audioset2M.sh. Could you please check?

@Gariscat

Same issue. I checked the website and also found only the features, not the original waveforms. How can we get the raw data, or has it not been released at all?

@Jingerjia

My crude solution is:
Download the HTML page of each class; it lists each video's YouTube ID, start time, end time, and labels.
By parsing those pages, we can then download every video we need.
Good luck!
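
For anyone scripting this: the same four fields (YouTube ID, start time, end time, labels) also appear in the segment CSV files on the AudioSet download page, which are easier to parse than HTML. Below is a minimal sketch, assuming each data row looks like `YTID, start_seconds, end_seconds, "label1,label2"` and that header lines start with `#`. The names `Segment`, `parse_segments`, and `watch_url` are hypothetical helpers, not part of any repo mentioned here, and the actual download step is left out.

```python
import csv
import io
from typing import List, NamedTuple


class Segment(NamedTuple):
    ytid: str          # YouTube video ID
    start: float       # clip start, in seconds
    end: float         # clip end, in seconds
    labels: List[str]  # AudioSet label IDs such as "/m/09x0r"


def parse_segments(text: str) -> List[Segment]:
    """Parse the text of a segments CSV (assumed layout, see above)."""
    # Drop comment/header lines that start with '#'.
    lines = (ln for ln in io.StringIO(text) if not ln.startswith("#"))
    # skipinitialspace lets the quoted label field follow ", ".
    reader = csv.reader(lines, skipinitialspace=True)
    segments = []
    for row in reader:
        if len(row) < 4:  # skip blank or malformed rows
            continue
        ytid, start, end, labels = row[0], row[1], row[2], row[3]
        segments.append(
            Segment(ytid, float(start), float(end), labels.split(","))
        )
    return segments


def watch_url(seg: Segment) -> str:
    """Build the YouTube watch URL for a segment."""
    return f"https://www.youtube.com/watch?v={seg.ytid}"
```

Each `Segment` then tells you which video to fetch and which time range to cut out with whatever download tool you use.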

@IvanBirkmaier

You can also use the .wav data provided on Hugging Face: https://huggingface.co/datasets/confit/audioset-full or Baidu: https://pan.baidu.com/s/13WnzI1XDSvqXZQTS-Kqujg, password: 0vc2 (source: https://github.com/qiuqiangkong/audioset_tagging_cnn).

In the Hugging Face dataset (eval) there is one broken file: ID YmW3... (if I remember correctly). Delete it, otherwise it can cause headaches :D
After downloading the data, you have to create train and eval JSON files like the ones in AST (https://github.com/YuanGongND/ast) (see egs/audioset/datafiles/sample_...). Don't forget that you only need the audio path and the labels!

Each entry must look like this:

    {
        "wav": "your path to wav file (doesn't have to be .flac file -> torchaudio supports both)",
        "labels": "/m/068hy,/m/07q6cd_,/m/0bt9lr,/m/0jbk"
    },

The label mapping for the WAV files can be built from https://github.com/audioset/ontology and the CSV files (balanced_train_segments.csv, etc.) provided on the AudioSet website: https://research.google.com/audioset/download.html
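
Here is a rough sketch of how the datafile could be assembled once you have the WAV files and a mapping from YouTube ID to label IDs (taken from the segment CSVs). Several things here are assumptions on my side, so check them against your own download: that files are named `<prefix><YTID>.wav` (the `name_prefix` parameter is hypothetical, with `"Y"` as a guessed default), and that the entries are wrapped in a top-level `"data"` list as in the AST sample datafiles. `build_datafile` is an illustrative helper, not a function from AST or AudioMAE.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_datafile(
    wav_dir: str,
    labels_by_ytid: Dict[str, List[str]],
    out_path: str,
    name_prefix: str = "Y",  # ASSUMPTION: files are named "Y<YTID>.wav"
) -> int:
    """Write a JSON datafile of {"wav", "labels"} entries; return the count."""
    entries = []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        stem = wav.stem
        # Strip the assumed filename prefix to recover the YouTube ID.
        if name_prefix and stem.startswith(name_prefix):
            stem = stem[len(name_prefix):]
        labels = labels_by_ytid.get(stem)
        if labels is None:
            continue  # no labels known for this file, so skip it
        entries.append({"wav": str(wav), "labels": ",".join(labels)})
    # ASSUMPTION: entries live under a top-level "data" key.
    with open(out_path, "w") as f:
        json.dump({"data": entries}, f, indent=1)
    return len(entries)
```

Files without a label entry are skipped rather than written with empty labels, so the returned count is a quick check that the filename-to-ID matching worked.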

Good luck!
