Sound augmentation using a large-scale audio dataset (AudioSet)

AppleHolic/audioset_augmentor

Audio Augmentation using AudioSet

  • AudioSet

  • Goal

    • Augment speech data with a variety of background sounds for speech-related tasks.
  • Report on research case

    • If you are studying one specific dataset, this augmentation may not help you get better results.
    • It does, however, give better results on test cases.
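
The goal above, mixing background sounds into speech, can be sketched as follows. This is a minimal illustration, not this package's implementation: `mix_at_snr` and `rms` are hypothetical helpers, the signals are synthetic stand-ins, and plain Python lists are used only to keep the example dependency-free (array libraries are the usual choice for real audio).

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` so the speech-to-noise ratio is `snr_db` decibels.

    Hypothetical helper for illustration; not part of audioset_augmentor.
    """
    # Loop the noise if it is shorter than the speech, then trim to length.
    repeats = len(speech) // len(noise) + 1
    noise = (list(noise) * repeats)[: len(speech)]

    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(speech)  # silent noise clip: nothing to add

    # Gain that brings the noise to the level implied by the target SNR.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals standing in for a speech clip and an AudioSet clip.
random.seed(0)
speech = [math.sin(2 * math.pi * 220 * t / 22050) for t in range(22050)]
noise = [random.uniform(-1.0, 1.0) for _ in range(8000)]
mixed = mix_at_snr(speech, noise, snr_db=10.0)
```

Lower `snr_db` values make the background louder relative to the speech. Sampling the value from a range for each training example is one common way to vary the difficulty.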

How to Use

  • Installation
    • Install ffmpeg version 4, then this package.
$ apt install -y software-properties-common
$ add-apt-repository ppa:jonathonf/ffmpeg-4
$ apt update
$ apt install -y ffmpeg
$ pip install -e .
  • Download
    • AudioSet provides separate metadata files for its label-balanced and unbalanced subsets.
    • Default: balanced
$ python audioset_augmentor/download.py [--file_path='assets/balanced_train_segments.csv' --savedir='.data' --n_jobs=4 --delay=0.05]
  • Preprocess Audio
    • Adjusts the volume, sample rate, and file type of the downloaded audio files.
$ python audioset_augmentor/preprocess.py [--master_dir, --out_dir, --meta_path='assets/balanced_train_segments.csv' --out_sr=22050 --min_size=1000000(file checker) --n_jobs=4]
  • Finally, set master_dir in assets/default.json so that the augment function can find the processed audio.
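
The metadata file passed to the download and preprocess steps follows AudioSet's segment CSV layout: `#`-prefixed header lines, then rows of YouTube ID, start time, end time, and a quoted list of label IDs. As a rough sketch of inspecting such a file before downloading, the hypothetical `read_segments` helper below parses that layout. It is not part of this package, and the sample rows use made-up video IDs.

```python
import csv
import io


def read_segments(lines):
    """Yield (ytid, start_seconds, end_seconds, labels) from AudioSet segment CSV lines."""
    # Header lines start with '#'; skip them and any blank lines.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    # Fields are separated by ", " and the label list is double-quoted.
    for ytid, start, end, labels in csv.reader(data_lines, skipinitialspace=True):
        yield ytid, float(start), float(end), labels.split(",")


# Made-up rows in the same layout; the video IDs are placeholders.
sample = io.StringIO(
    "# YTID, start_seconds, end_seconds, positive_labels\n"
    'aaaaaaaaaaa, 30.000, 40.000, "/m/09x0r,/t/dd00088"\n'
    'bbbbbbbbbbb, 0.000, 10.000, "/m/04rlf"\n'
)
segments = list(read_segments(sample))
```

To read the real file, pass an open file handle instead of the in-memory sample, for example `read_segments(open('assets/balanced_train_segments.csv'))`.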

License

The dataset is made available by Google Inc. under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, while the ontology is available under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

  • All other sources are under the MIT License.