GitHub - sjappig/mldn-capstone: AudioSet classification using RNN

Capstone Project for Machine Learning Nanodegree

Acquiring AudioSet samples

When in the directory dataset/audioset/:

Check that the following dependencies are installed: bash, ffmpeg and youtube-dl
Training dataset: mkdir train_data; cd train_data; cat ../balanced_train_segments.csv | ../download/download.sh
Test dataset: mkdir test_data; cd test_data; cat ../eval_segments.csv | ../download/download.sh
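
The dependency check in the first step can be sketched as a small shell helper. This is only an illustration: the function name missing_deps is hypothetical and not part of this repository; it relies on the standard `command -v` builtin to look each tool up on PATH.

```shell
# Hypothetical helper (not part of the repository):
# print the name of every command given as an argument that is not on PATH.
missing_deps() {
    local cmd
    for cmd in "$@"; do
        # command -v succeeds when the command can be resolved.
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage: missing_deps bash ffmpeg youtube-dl
```

Running `missing_deps bash ffmpeg youtube-dl` should print nothing when all three tools are installed, and otherwise one line per missing tool.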

Downloading can be canceled by pressing Ctrl+C (press it a few times if once does not seem to work). When started again, the script skips the files that were already downloaded (you might need to remove .part files manually). Note that the scripts below won't work if there is only a very small number of samples (this seems to be caused by a bug in pandas), so download at least ~100 samples for both training and testing before proceeding.
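
A rough way to check how many samples have been downloaded, and to clean up leftover .part files, might look like the sketch below. The function names are hypothetical, and it assumes that every regular file in the data directory that does not end in .part is a finished sample; the actual file names produced by download.sh are not specified here.

```shell
# Hypothetical helpers (not part of the repository).

# Count finished downloads: regular files directly in the directory
# whose names do not end in .part (assumption: each such file is one sample).
count_samples() {
    find "$1" -maxdepth 1 -type f ! -name '*.part' | wc -l | tr -d ' '
}

# Remove leftover partial downloads from the directory.
remove_partials() {
    find "$1" -maxdepth 1 -type f -name '*.part' -exec rm -f {} +
}

# Usage, from dataset/audioset/:
#   count_samples train_data
#   remove_partials train_data
```

If `count_samples train_data` or `count_samples test_data` reports fewer than about 100, continue downloading before moving on to preprocessing.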

Preprocessing the samples and training models

When in the repository root:

Create a virtual environment: virtualenv --python=python2.7 sandbox
Activate the environment: source sandbox/bin/activate
Install required packages: pip install -r requirements.txt
Preprocess training data: python -m audiolabel.preprocess pp/train.h5 dataset/audioset/balanced_train_segments.csv dataset/audioset/train_data
Preprocess test data: python -m audiolabel.preprocess --normalize-using pp/train.h5 pp/test.h5 dataset/audioset/eval_segments.csv dataset/audioset/test_data
Train and test a model with a small number of samples and a small epoch count: python -m audiolabel.fit_and_predict pp/train.h5 --N 100 --epochs 100 --validation-size 0 --test pp/test.h5 (this might take several minutes)

Check the available command-line parameters for both scripts with --help.
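
The preprocessing and training steps above could be chained in a small wrapper, sketched below. The function run_pipeline and the PYTHON override are hypothetical conveniences, not part of the repository, and the `mkdir -p pp` line assumes the pp/ output directory may not exist yet. The three python commands themselves are copied from the steps above and are meant to be run from the repository root with the virtual environment activated.

```shell
# Hypothetical wrapper (not part of the repository).
# Set PYTHON=echo to print the commands instead of running them.
run_pipeline() {
    local py="${PYTHON:-python}"
    local data="dataset/audioset"

    # Assumption: the output directory pp/ may need to be created first.
    mkdir -p pp || return 1

    # Preprocess the training data.
    "$py" -m audiolabel.preprocess pp/train.h5 \
        "$data/balanced_train_segments.csv" "$data/train_data" || return 1

    # Preprocess the test data, normalizing with the training statistics.
    "$py" -m audiolabel.preprocess --normalize-using pp/train.h5 pp/test.h5 \
        "$data/eval_segments.csv" "$data/test_data" || return 1

    # Train and test on a small number of samples.
    "$py" -m audiolabel.fit_and_predict pp/train.h5 --N 100 --epochs 100 \
        --validation-size 0 --test pp/test.h5
}
```

Calling `PYTHON=echo run_pipeline` first is a cheap way to review the exact commands before starting a run that might take several minutes.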

