Music recognition and generation using deep learning.
The project currently has three components, which exist as standalone packages (each stage is sketched below):
- Preprocessing - convert NSynth audio .wav files to spectrograms with audiolib.
- Training - train a Convolutional Neural Network to classify instrument spectrograms with PyTorch.
- Serving - serve the classifier as a RESTful API with flask and gunicorn.
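As a rough sketch of the preprocessing stage, the snippet below converts a .wav file to a log-scaled mel spectrogram. It uses librosa as a stand-in for audiolib (whose API is not shown here); the sample rate and spectrogram parameters are assumptions.

```python
import librosa
import numpy as np

def wav_to_log_mel(path, sr=16000, n_fft=2048, hop_length=512, n_mels=128):
    """Load a NSynth .wav file and return a log-scaled mel spectrogram.

    All parameter defaults are assumptions, not the repo's actual values.
    """
    audio, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    # Convert power to decibels so the CNN sees a log-compressed input.
    return librosa.power_to_db(mel, ref=np.max)
```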
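The training stage's network could look roughly like the following PyTorch sketch: a small CNN over spectrogram "images". This is illustrative only, not the repo's actual architecture.

```python
import torch
import torch.nn as nn

class InstrumentCNN(nn.Module):
    """Illustrative CNN classifier over (1, n_mels, frames) spectrograms."""

    def __init__(self, n_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x):  # x: (batch, 1, n_mels, frames)
        return self.head(self.features(x))
```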
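And the serving stage wraps the trained model in a flask app. The route name, payload format, checkpoint path, and module path below are all assumptions.

```python
from flask import Flask, jsonify, request
import torch

from training.model import InstrumentCNN  # hypothetical path; see the training sketch above

app = Flask(__name__)
model = InstrumentCNN(n_classes=2)
model.load_state_dict(torch.load("model.pt"))  # hypothetical checkpoint path
model.eval()

@app.route("/predict", methods=["POST"])
def predict():
    # Assumed payload: {"spectrogram": [[...], ...]} with shape (n_mels, frames).
    spec = torch.tensor(request.get_json()["spectrogram"], dtype=torch.float32)
    with torch.no_grad():
        logits = model(spec[None, None])  # add batch and channel dimensions
    return jsonify({"class_id": int(logits.argmax(dim=1).item())})
```

In production, gunicorn would run this app rather than flask's development server, e.g. `gunicorn serving.app:app` (module path assumed).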
Each package:
- Is callable from the command-line and has configurable parameters (a sketch of the corresponding argument parsing follows this list). For example, preprocessing is called with:

```
python -m preprocessing.task \
    --data_dir path/to/import/raw/data \
    --job_dir path/to/export/processed/data \
    --filters_dir path/to/import/instrument/filters \
    --config $config \
    --instruments '["keyboard_acoustic", "guitar_acoustic"]'
```
- Contains a JSON file of run configurations for reproducibility (a sketch of loading such a config follows this list). For example, the preprocessing config file:
  - Gets parsed as `$config` in the above preprocessing example.
  - Gets exported by the training stage, so that the data used for training can be reproduced.
- Contains shell scripts to run the package locally, to run it with Docker, and to deploy the Docker image to the cloud with a specific configuration ID (the training scripts are one example).
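A minimal sketch of how a task entrypoint such as `preprocessing.task` might parse the parameters shown above, assuming argparse; the real module may differ.

```python
import argparse
import json

def parse_args():
    parser = argparse.ArgumentParser(description="Preprocess NSynth audio.")
    parser.add_argument("--data_dir", required=True, help="Raw data to import")
    parser.add_argument("--job_dir", required=True, help="Where to export processed data")
    parser.add_argument("--filters_dir", required=True, help="Instrument filters to import")
    parser.add_argument("--config", required=True, help="Run configuration ID")
    parser.add_argument("--instruments", type=json.loads,
                        help='JSON list, e.g. \'["keyboard_acoustic", "guitar_acoustic"]\'')
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
```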
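The run-configuration pattern itself might look like the sketch below: the JSON file maps a configuration ID to parameters, and the `--config` flag selects one entry. The schema and file name are illustrative assumptions, not the repo's actual ones.

```python
import json

# Illustrative schema only: a run configuration keyed by ID, so that any run
# can be reproduced exactly by re-running with the same configuration ID.
config = {
    "test": {
        "instruments": ["keyboard_acoustic", "guitar_acoustic"],
        "sample_rate": 16000,  # assumed NSynth sample rate
        "n_fft": 2048,         # assumed spectrogram parameters
        "hop_length": 512,
    }
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)

# At run time, the value passed as --config selects one entry by ID:
with open("config.json") as f:
    run_config = json.load(f)["test"]
```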
The planned scope of the project:
- Instrument recognition (current):
  - Instrument classification from single note audio
  - Instrument detection from multiple note audio (songs)
- Genre recognition:
  - Genre classification from songs
- Music generation:
  - Instrument note generation
  - Musical piece generation
  - Song generation