The DeepInstruments spec, 22 Dec 2015
A. Audio features
1. [DONE] Review silence detection.
2. [DONE] Convert features to float32.
3. [CLOSED] Generate silenced frames in test set.
4. [DONE] Review perceptual loudness reference in get_X.
5. [DONE] Explicitly ignore WavFileWarning.
6. [DONE] For standardization, only collect the mean and variance of activated frames.
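Item A6 can be sketched as follows: collect the mean and variance only over frames whose activation exceeds a threshold, then apply those statistics to every frame. The function name, the 0.5 threshold, and the epsilon are illustrative assumptions, not values fixed by this spec.

```python
import numpy as np

def standardize_activated(X, activations, threshold=0.5):
    """Standardize a (frames x bins) feature matrix X using statistics
    collected only over activated frames (item A6).
    `threshold` is an assumed default, not taken from the spec."""
    mask = activations > threshold
    mu = X[mask].mean(axis=0)
    sigma = X[mask].std(axis=0) + 1e-8  # avoid division by zero
    return (X - mu) / sigma
```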
B. Deep learning
1. [DONE] Write Graph model without Z.
2. [DONE] Make it a function in module "learning".
3. [DONE] Solve core dump.
4. [DONE] Install bleeding-edge Keras.
5. [DONE] Train on categorical cross-entropy.
6. [DONE] Write data generator.
7. [DONE] Add Z supervision.
8. [DONE] Report mean and variance of loss after each epoch.
C. Pitch supervision
1. [CLOSED] Get Gt samples for RWC.
2. [DONE] Check MIDI offsets in RWC dict.
3. [DONE] Write conversion from MIDI to ConvNet axis.
4. [DONE] Patch rankings for The Districts, Vermont.
5. [DONE] Extract Z.
6. [DONE] Extract G.
7. [DONE] Flow Z and G through the datagen.
8. Max-pool G over the size of the decision length.
9. [DONE] Write LambdaMerge function for difference.
10. [DONE] Define a tunable weight for the Z loss.
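The MIDI-to-ConvNet-axis conversion (item C3) amounts to mapping a MIDI note number onto a row index of the log-frequency input. A minimal sketch, where the lowest MIDI note (21, i.e. A0) and the resolution of one bin per semitone are assumptions rather than values stated in the spec:

```python
def midi_to_bin(midi, midi_min=21, bins_per_semitone=1):
    """Map a MIDI note number to a row index on the ConvNet's
    log-frequency axis (item C3). midi_min and bins_per_semitone
    are illustrative defaults, not fixed by the spec."""
    return int(round((midi - midi_min) * bins_per_semitone))
```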
D. Evaluation
1. [DONE] Write class-based accuracy measure.
2. [DONE] Write callbacks to monitor test error.
3. [DONE] Integrate the pipeline into a function so that the whole experiment can be run in one step.
4. [DONE] Measure class imbalance: how many decision windows per class?
5. Use MIR metrics for multi-label classification.
6. [DONE] Make an 80/20 file-based split for the retained instruments.
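The class-based accuracy measure (item D1) with error bars (reported later in sections G and H) could look like the sketch below. The binomial standard error is one reasonable choice of error bar; the spec does not say which definition was used.

```python
import numpy as np

def classwise_accuracy(y_true, y_pred, n_classes):
    """Per-class accuracy plus a simple binomial standard error,
    usable as error bars. The error-bar definition is an assumption."""
    accs, errs = [], []
    for c in range(n_classes):
        idx = np.flatnonzero(y_true == c)
        acc = np.mean(y_pred[idx] == c)
        accs.append(acc)
        errs.append(np.sqrt(acc * (1.0 - acc) / len(idx)))
    return np.array(accs), np.array(errs)
```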
E. Display
1. [DONE] Export filters from conv1 as images.
2. [DONE] Make a figure for the architecture.
3. [DONE] Make a figure for the durations of the training set and test set for every instrument in the single-label dataset.
F. Dataset
- [DONE] Get the full MedleyDB dataset
- [CLOSED] Update wrangling so that it lists files, not classes
- [DONE] Restrict to a certain number of classes
- [DONE] Take the max of stems activations that play the same instrument
- [DONE] Write a function that outputs Y from the Medley instrument activations, called by generator
- [DONE] Upload MedleyDB on di and cerfeuil
- [DONE] Extract annotated vs non-annotated files for single-label classes
- [DONE] If there are several stems of the same instrument in a given track, discard non-annotated stems from test set
- [DONE] Separate singers between training set and test set to avoid artist bias
- [DONE] Report misnomer of CroqueMadame_Pilot(Lakelot)_ACTIVATION_CONF.lab
- [DONE] Make a patch script in __init__.py to handle all misnomers
- [DONE] Discard overdrive, shoegaze, bleed and inactivity in clean electric guitar
- [DONE] Use version control for the medleydb-single-instruments derived dataset
- [DONE] Remove vocal FX tracks
- [DONE] Remove first and last chunk (half-silent by definition) of every track
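Two of the steps above (take the max of same-instrument stem activations, then output Y from the MedleyDB activations) can be sketched together. All names and the dict-based input format are illustrative assumptions about how the activation data is held in memory.

```python
import numpy as np

def activations_to_Y(stem_activations, stem_instruments, classes):
    """Build a (classes x frames) target matrix Y from per-stem
    activation curves, taking the max over stems that play the same
    instrument (section F). Input layout is an assumption:
    stem_activations maps stem name -> 1-D activation array,
    stem_instruments maps stem name -> instrument label."""
    n_frames = len(next(iter(stem_activations.values())))
    Y = np.zeros((len(classes), n_frames))
    for stem, activation in stem_activations.items():
        instrument = stem_instruments[stem]
        if instrument in classes:
            c = classes.index(instrument)
            Y[c] = np.maximum(Y[c], activation)  # max over same-instrument stems
    return Y
```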
G. Single-label classification
- [DONE] Write get_activation
- [DONE] Write get_indices (with boundary trimming)
- [DONE] Write get_melody
- [DONE] Memoize training X with joblib
- [DONE] Write a dedicated generator
- [DONE] Standardize X in the generator
- [DONE] Train deep neural network on X and Y
- [DONE] Memoize test X with joblib
- [DONE] Report class-wise accuracy with error bars
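The "dedicated generator" of section G could be as simple as the infinite shuffled batch generator below. The shuffling scheme and the batch layout are assumptions; the actual generator also standardizes X, which is omitted here for brevity.

```python
import numpy as np

def datagen(X, Y, batch_size, rng=None):
    """Minimal infinite batch generator over memoized features
    (section G). Reshuffles the sample order at every pass;
    this scheme is an assumption, not taken from the spec."""
    rng = rng or np.random.default_rng(0)
    n = len(X)
    while True:
        order = rng.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            yield X[idx], Y[idx]
```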
H. Descriptors + Random forests baseline
- [DONE] Compute MFCCs on the training data
- [DONE] Also Delta and Delta-Delta MFCCs
- [DONE] Also centroid, bandwidth, contrast, rolloff
- [DONE] Generate half-overlapping chunks of X
- [DONE] Summarize with mean and variance over chunks
- [DONE] Generate Y's as integer classes
- [DONE] Same in test set
- [DONE] Run scikit-learn's random forest on it
- [DONE] Report class-wise accuracy with error bars
- [DONE] Discard clean guitar and male singer
- [DONE] Bugfix half-overlapping chunks
- [DONE] More Tp, Cl, Fl, Pn, and Vl examples (from solosDb)
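The chunking and summarization steps of the baseline (half-overlapping chunks, then mean and variance over each chunk) can be sketched in plain NumPy. The hop of half a chunk matches "half-overlapping"; everything else (names, shapes) is illustrative.

```python
import numpy as np

def summarize_chunks(features, chunk_length):
    """Cut a (frames x dims) descriptor matrix into half-overlapping
    chunks and summarize each chunk by its per-dimension mean and
    variance over time (section H). hop = chunk_length // 2 encodes
    the half overlap."""
    hop = chunk_length // 2
    rows = []
    for start in range(0, features.shape[0] - chunk_length + 1, hop):
        chunk = features[start:start + chunk_length]
        rows.append(np.concatenate([chunk.mean(axis=0), chunk.var(axis=0)]))
    return np.stack(rows)
```

Each output row (mean concatenated with variance) is then one training example for the random forest.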
I. Structured validation
- [DONE] Extract the stem folder of each chunk path
- [DONE] Assign votes to a dict where stems are keys
- [DONE] Get the true class of each stem
- [DONE] Write a systematic structured evaluator
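The structured evaluator of section I (votes keyed by stem, scored against each stem's true class) can be sketched with the standard library alone. The majority-vote tie-breaking follows `Counter.most_common`, which is an assumption.

```python
from collections import Counter, defaultdict

def stem_accuracy(chunk_stems, chunk_preds, stem_truth):
    """Aggregate chunk-level predictions into one majority vote per
    stem, then score the votes against each stem's true class
    (section I). Ties break by Counter.most_common order, which is
    an assumption, not specified here."""
    votes = defaultdict(Counter)
    for stem, pred in zip(chunk_stems, chunk_preds):
        votes[stem][pred] += 1
    correct = sum(votes[s].most_common(1)[0][0] == stem_truth[s]
                  for s in votes)
    return correct / len(votes)
```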
J. Reproducibility
- List all operations that are necessary
- Review this list on gentiane
K. Scattering transform
- [DONE] Write function get_paths in MATLAB
- [DONE] Compute joint scattering features
- Compute plain scattering features
- Compute spiral scattering features
- Review the importance of log compression
- [DONE] Export in HDF5 from MATLAB to Python
- Check that paths are ordered like in Python
- Load HDF5, train RF, report accuracy
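The "check that paths are ordered like in Python" item hides a classic pitfall: MATLAB stores arrays column-major (Fortran order) while NumPy defaults to row-major. One way to guard against it is to reshape flattened MATLAB exports explicitly with `order="F"`, as in this sketch (the function name is illustrative):

```python
import numpy as np

def from_matlab_order(flat, shape):
    """Reshape a flat array exported from MATLAB (column-major) into
    a NumPy array of the intended shape. order="F" makes the
    Fortran-vs-C memory-order difference explicit (section K)."""
    return np.asarray(flat).reshape(shape, order="F")
```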
L. References
- Fuhrmann: musical instrument classification
- Joder et al.: musical instrument classification
- Dieleman and Schrauwen: deep learning for audio
- Humphrey, Bello, and LeCun: deep architectures for music informatics
- Salamon and Bello: feature learning
- [DONE] Li, Qian, and Wang: ConvNets on raw audio for multilabel instrument recognition
- [DONE] McFee et al.: librosa
- [DONE] Kingma and Ba: Adam optimizer
- [DONE] Chollet: Keras package
- [DONE] Bittner et al.: MedleyDB
- [CLOSED] Bruna, Szlam, and LeCun: learning stable group invariant representations
- [CLOSED] Mallat (2016): understanding deep convolutional networks