Phoneme classification code for the review of the EUSIPCO 2017 paper:
Timbre Analysis of Music Audio Signals with Convolutional Neural Networks
Steps for reproducing the experiment results:

1. Clone this repository.
2. Download the Jingju a cappella singing dataset from http://doi.org/10.5281/zenodo.344932
3. Change the `dataset_path` variable in `parameters.py` so that it points to the downloaded dataset (see the sketch after this list).
4. Install the dependencies (see below).
5. Choose `dataset` in `parameters.py` to run the experiment on the dan or the laosheng dataset.
6. Run the experiment with `python doPhonemeClassification.py`.
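A minimal sketch of the `parameters.py` settings the steps above refer to; only `dataset_path`, `dataset`, `am` and the value `'qmLonUpfLaosheng'` come from this README, while the example path and comments are placeholders:

```python
# parameters.py -- sketch of the variables this README asks you to edit.
# The path below is a placeholder; the dan dataset identifier is not given
# in this README, so it is left out.

# Absolute path to the downloaded Jingju a cappella singing dataset
dataset_path = '/path/to/jingju_a_cappella_singing_dataset'

# Which dataset to run the experiment on ('qmLonUpfLaosheng' = laosheng)
dataset = 'qmLonUpfLaosheng'

# Acoustic model: 'cnn' for the proposed/Choi models, 'gmm' for the GMM baseline
am = 'cnn'
```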
Steps for calculating the mel bands features:

1. Execute steps 1, 2 and 3 in "Steps for reproducing the experiment results".
2. Choose the `dataset` and `am` variables in `parameters.py`. For example, `dataset='qmLonUpfLaosheng'` and `am='cnn'` means extracting the laosheng features for the convolutional neural networks (the proposed and Choi models).
3. Run `python phonemeSampleCollection.py` to extract the mel bands features (a feature-extraction sketch follows this list).

Note: the code for extracting the features for the MLP model is not included.
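For orientation, below is a self-contained sketch of mel-band extraction with Essentia's standard Python API; the frame size, hop size and number of bands are assumptions and need not match what `phonemeSampleCollection.py` actually uses:

```python
import numpy as np
import essentia.standard as ess

def extract_mel_bands(filename, frame_size=2048, hop_size=1024, n_bands=80, fs=44100):
    """Return a (n_frames, n_bands) array of log mel-band energies."""
    audio = ess.MonoLoader(filename=filename, sampleRate=fs)()
    window = ess.Windowing(type='hann')
    spectrum = ess.Spectrum(size=frame_size)
    mel = ess.MelBands(numberBands=n_bands, sampleRate=fs, inputSize=frame_size // 2 + 1)
    bands = []
    for frame in ess.FrameGenerator(audio, frameSize=frame_size, hopSize=hop_size):
        bands.append(mel(spectrum(window(frame))))
    # Log compression with a small floor to avoid log(0)
    return np.log(np.array(bands) + 1e-10)
```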
Steps for downloading the pre-computed mel bands features:

1. Download the pre-computed mel bands features from http://doi.org/10.5281/zenodo.344935
2. Create a folder named `trainingData` in the root of this repository, then put all `.pickle.gz` feature files into this folder (a loading sketch follows this list).

If you don't want to download the pre-computed features, please follow "Steps for calculating the mel bands features".
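To sanity-check a downloaded feature file, a small loading sketch is shown below; the file name is hypothetical and the internal structure of the pickled object is an assumption, so inspect what you actually get back:

```python
import gzip
import pickle

# Load one pre-computed feature file from the trainingData folder.
# 'example_features.pickle.gz' is a placeholder name; the exact contents
# of the pickled object are an assumption, so inspect them after loading.
with gzip.open('trainingData/example_features.pickle.gz', 'rb') as f:
    data = pickle.load(f)
print(type(data))
```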
Steps for training the proposed, Choi, MLP and GMM models:

- The model training code is located in the `pretrainedDLModels` folder. The `keras_cnn*` code trains the CNN models (proposed and Choi models); the `keras_dnn*` code trains the MLP model.
- To train the GMM models, set `am='gmm'` in `parameters.py`, then execute steps 1 and 2 in "Steps for calculating the mel bands features" (an illustrative GMM sketch follows this list).
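As a rough illustration of the GMM branch (not the repository's actual training code), one Gaussian mixture per phoneme class can be fitted on the mel-band frames with scikit-learn and a segment classified by comparing per-class log-likelihoods; the number of components and covariance type below are assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmms(features_by_phoneme, n_components=16):
    """Fit one GaussianMixture per phoneme class.

    features_by_phoneme maps a phoneme label to a (n_frames, n_dims) array.
    n_components=16 is an assumption, not the paper's setting.
    """
    gmms = {}
    for phoneme, X in features_by_phoneme.items():
        gmms[phoneme] = GaussianMixture(n_components=n_components,
                                        covariance_type='diag').fit(X)
    return gmms

def classify(gmms, X):
    """Return the phoneme whose GMM gives the highest average log-likelihood for X."""
    scores = {phoneme: gmm.score(X) for phoneme, gmm in gmms.items()}
    return max(scores, key=scores.get)
```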
Dependencies:

"Steps for reproducing the experiment results" requires the following packages:
python2 numpy scipy scikit-learn matplotlib essentia

"Steps for calculating the mel bands features" requires the following packages:
python2 numpy scipy scikit-learn essentia

"Steps for training the proposed, Choi, MLP and GMM models" requires the following packages:
python2 numpy scipy scikit-learn essentia keras theano hyperopt
License: GNU Affero General Public License version 3