Google Audio Set classification with Keras and TensorFlow

Audio Set [1] is a large-scale, weakly labelled dataset containing over 2 million 10-second audio clips across 527 classes, published by Google in 2017.

This codebase is an implementation of [2, 3], which propose attention neural networks for Audio Set classification and achieve a mean average precision (mAP) of 0.360.
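
For orientation, the sketch below shows the single-level attention pooling idea from [2] in Keras: per-frame class probabilities are averaged with learned attention weights that are normalised over the time axis. The hidden layer size and other details are illustrative and not necessarily the exact configuration used in this repository.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_FRAMES, EMB_DIM, NUM_CLASSES = 10, 128, 527  # 10 x 128 frame embeddings, 527 classes

frames = layers.Input(shape=(NUM_FRAMES, EMB_DIM))
h = layers.Dense(512, activation="relu")(frames)            # shared frame-wise hidden layer (size illustrative)
cla = layers.Dense(NUM_CLASSES, activation="sigmoid")(h)    # per-frame class probabilities
att = layers.Dense(NUM_CLASSES, activation="sigmoid")(h)    # per-frame attention scores
att = layers.Lambda(
    lambda a: a / tf.reduce_sum(a, axis=1, keepdims=True)   # normalise attention over the time axis
)(att)
clip_prob = layers.Lambda(
    lambda t: tf.reduce_sum(t[0] * t[1], axis=1)            # attention-weighted average over frames
)([cla, att])

model = Model(frames, clip_prob)
model.compile(optimizer="adam", loss="binary_crossentropy")  # multi-label clip-level training
model.summary()
```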

Download dataset

Download the Audio Set dataset.
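
The released Audio Set features are distributed as TensorFlow SequenceExample records containing 128-dimensional quantised frame embeddings. Assuming the code is trained on these released features, a minimal sketch for decoding one .tfrecord file into arrays might look like the following (file paths are illustrative):

```python
import tensorflow as tf

def parse_record(serialized):
    # Parse one SequenceExample from the released Audio Set feature files.
    context, sequence = tf.io.parse_single_sequence_example(
        serialized,
        context_features={
            "video_id": tf.io.FixedLenFeature([], tf.string),
            "labels": tf.io.VarLenFeature(tf.int64),
        },
        sequence_features={
            "audio_embedding": tf.io.FixedLenSequenceFeature([], tf.string),
        },
    )
    # Each frame embedding is stored as 128 quantised uint8 bytes.
    embedding = tf.io.decode_raw(sequence["audio_embedding"], tf.uint8)
    labels = tf.sparse.to_dense(context["labels"])
    return context["video_id"], embedding, labels

dataset = tf.data.TFRecordDataset(["bal_train/00.tfrecord"])  # illustrative path
for serialized in dataset:
    video_id, emb, labels = parse_record(serialized)
    print(video_id.numpy(), emb.numpy().shape, labels.numpy())
```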

Run

Users may optionally choose TensorFlow in runme.sh to run the code.

./runme.sh

Results

Mean average precision (mAP) of different models.
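
For reference, mAP here is the mean over classes of the per-class average precision on the evaluation clips. A minimal sketch using scikit-learn (array names and shapes are illustrative):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def mean_average_precision(y_true, y_score):
    """y_true: (num_clips, 527) binary labels; y_score: (num_clips, 527) predicted scores."""
    ap_per_class = [
        average_precision_score(y_true[:, k], y_score[:, k])
        for k in range(y_true.shape[1])
        if y_true[:, k].any()  # skip classes with no positive examples
    ]
    return float(np.mean(ap_per_class))
```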

References

[1] Gemmeke, Jort F., et al. "Audio Set: An ontology and human-labeled dataset for audio events." Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.

[2] Kong, Qiuqiang, et al. "Audio Set classification with attention model: A probabilistic perspective." arXiv preprint arXiv:1711.00927 (2017).

[3] Yu, Changsong, et al. "Multi-level Attention Model for Weakly Supervised Audio Classification." arXiv preprint arXiv:1803.02353 (2018).

External links

The original implementation of [3] was created by Changsong Yu: https://github.com/ChangsongYu/Eusipco2018_Google_AudioSet

Contact

Bin Wang (wang.bin # gmx.com)