-
AudioSet
- Google's dataset of manually annotated audio events
- ontology: https://github.com/audioset/ontology
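- The ontology ships as a flat JSON list. A minimal sketch of loading it and looking up class names, assuming the published layout where each entry carries at least an "id", a "name", and "child_ids":

import json

# Load assets/ontology.json and build an id -> human-readable name map.
with open("assets/ontology.json") as f:
    ontology = json.load(f)

names = {node["id"]: node["name"] for node in ontology}

# Example: list the direct children of the "Human sounds" class.
for node in ontology:
    if node["name"] == "Human sounds":
        print([names[c] for c in node.get("child_ids", [])])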
-
Goal
- Augment speech-related tasks with various background sound situations (a mixing sketch follows below).
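- The core idea is to mix an AudioSet clip into a speech recording at a chosen signal-to-noise ratio. The snippet below is only an illustrative sketch of that mixing step, not this package's API; the file paths, the 10 dB SNR, and mono audio are assumptions.

import numpy as np
import soundfile as sf

def mix_at_snr(speech, noise, snr_db):
    """Mix a (mono) noise clip into speech at the requested SNR in dB."""
    # Tile or truncate the noise so both signals have the same length.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale noise so that speech_power / (scale^2 * noise_power) = 10^(snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

speech, sr = sf.read("speech.wav")              # hypothetical speech file
noise, _ = sf.read(".data/processed/clip.wav")  # hypothetical preprocessed AudioSet clip
sf.write("augmented.wav", mix_at_snr(speech, noise, snr_db=10), sr)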
-
Report on research case
- If you are targeting a specific dataset, this augmentation may not improve results on that dataset alone.
- In a source separation case, training losses were higher with AudioSet augmentation.
- However, results on the test cases were better.
- Installation
- Install ffmpeg version 4, then install this package:
$ apt install -y software-properties-common
$ add-apt-repository ppa:jonathonf/ffmpeg-4
$ apt update
$ apt install -y ffmpeg
$ pip install -e .
- Download
- AudioSet provides separate metadata files for the label-balanced and unbalanced splits.
- default: balanced
$ python audioset_augmentor/download.py [--file_path='assets/balanced_train_segments.csv' --savedir='.data' --n_jobs=4 --delay=0.05]
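- For reference, the segment CSVs follow the standard AudioSet layout: comment lines starting with '#', then one row per clip with YTID, start_seconds, end_seconds, and a quoted list of positive label ids. A minimal parsing sketch:

import csv

segments = []
with open("assets/balanced_train_segments.csv") as f:
    # Header lines start with '#'; data rows are: YTID, start, end, "label,label,...".
    for row in csv.reader(f, skipinitialspace=True):
        if not row or row[0].startswith("#"):
            continue
        ytid, start, end = row[0], float(row[1]), float(row[2])
        labels = row[3].split(",")  # e.g. ["/m/09x0r", "/t/dd00088"]
        segments.append((ytid, start, end, labels))

print(len(segments), "segments; first:", segments[0])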
- Preprocess Audio
- Adjusts the volume, sample rate, and file type of the downloaded audio files; --min_size acts as a simple file checker.
$ python audioset_augmentor/preprocess.py [--master_dir --out_dir --meta_path='assets/balanced_train_segments.csv' --out_sr=22050 --min_size=1000000 --n_jobs=4]
- Finally, set master_dir in assets/default.json to use the augment function.
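- Under the hood, this kind of preprocessing is a per-file ffmpeg call. A rough sketch of one conversion, not the package's actual code; the paths are placeholders, and the mono downmix and loudnorm volume filter are assumptions:

import subprocess

def convert(in_path, out_path, out_sr=22050):
    """Resample, normalize loudness, and rewrite one clip as wav via ffmpeg."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", in_path,
         "-ar", str(out_sr),      # target sample rate
         "-ac", "1",              # downmix to mono (assumption)
         "-af", "loudnorm",       # one way to adjust volume
         out_path],
        check=True,
    )

convert(".data/raw/clip.m4a", ".data/processed/clip.wav")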
-
The 'AudioSet' license is described at https://research.google.com/audioset/download.html
-
File List
- assets/balanced_train_segments.csv
- assets/unbalanced_train_segments.csv
- assets/ontology.json
-
The dataset is made available by Google Inc. under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, while the ontology is available under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
- Other sources are under the MIT License.