- Introduction
- Requirement
- Training
- Resume from Checkpoints
- Training with a less powerful GPU
- Testing pretrained model
- Half-Precision Training
- Default settings
- Inferencing
- Development
```
demucs_lightning
├── conf
│   ├── train_test_config.yaml
│   └── infer_config.yaml
├── demucs
│   ├── demucs.py
│   ├── hdemucs.py
│   └── other custom modules
├── requirements.txt
├── train.py
├── test.py
└── inference.py
```
## Introduction

There are two major released versions of Demucs:

- Demucs (v2) operates in the waveform domain.
- Hybrid Demucs (v3) features hybrid spectrogram/waveform source separation.

You can find their model structures in `demucs.py` and `hdemucs.py` inside the `demucs` folder. For official information about Demucs, you can visit facebookresearch/demucs.

Demucs is trained on MusdbHQ. This repo uses AudioLoader to obtain the MusdbHQ dataset. For more information about AudioLoader, you can visit KinWaiCheuk/AudioLoader. Alternatively, you can download the MusdbHQ dataset manually from Zenodo.
## Requirement

Python 3.8.10 and `ffmpeg` are required to run this repo. If `ffmpeg` is not installed on your machine, you can install it via `apt install ffmpeg`.

You can install all required libraries at once via:

```bash
pip install -r requirements.txt
```
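For example, one way to set up a matching environment (assuming you use conda; any environment with the pinned Python version works):

```bash
# create and activate an isolated environment with the pinned Python version
conda create -n demucs_lightning python=3.8.10
conda activate demucs_lightning

# install ffmpeg and the repo's dependencies
apt install ffmpeg
pip install -r requirements.txt
```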
## Training

TensorBoard logging is used by default. If you want to use the WandbLogger instead (recommended!), either edit `logger` in `conf/train_test_config.yaml` or append `logger=wandb` to all your commands.
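For example (an illustrative combination of the `logger` override with the training command shown below):

```bash
python train.py devices=[0] model=Demucs logger=wandb
```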
If it is your first time running the repo, you can use the argument `download=True` to automatically download and set up the `musdb18hq` dataset. Otherwise, you can omit this argument.

Training Demucs (v2) requires 16,885 MB of GPU memory. If you do not have enough GPU memory, please read the section "Training with a less powerful GPU" below.

```bash
python train.py devices=[0] model=Demucs download=True
```

Training Hybrid Demucs (v3) requires 19,199 MB of GPU memory.

```bash
python train.py devices=[0] model=HDemucs download=True
```
## Resume from Checkpoints

It is possible to continue training from an existing checkpoint by passing the `resume_checkpoint` argument. By default, Hydra saves all checkpoints to `outputs/YYYY-MM-DD/HH-MM-SS/XXX_experiment_epoch=XXX_augmentation=XXX/version_1/checkpoints/XXX.ckpt`. For example, if you already have a checkpoint trained with 32-bit precision for 100 epochs via the following command:

```bash
python train.py devices=[0] trainer.precision=32 epochs=100
```

and you now want to train for 50 more epochs, you can use the following command:

```bash
python train.py devices=[0] trainer.precision=16 epochs=150 resume_checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'
```

You can always move your checkpoints to a more convenient location to shorten the path, as shown below.
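For instance (illustrative shell commands; the `checkpoints/` folder name is arbitrary):

```bash
# copy the checkpoint to a shorter path
mkdir -p checkpoints
cp 'outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt' checkpoints/

# resume from the shorter path
python train.py devices=[0] epochs=150 resume_checkpoint='checkpoints/e=123-TRAIN_loss=0.08.ckpt'
```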
## Training with a less powerful GPU

It is possible to reduce the GPU memory required to train the models by using the following tricks, but they might affect model performance.

You can reduce the batch size to 2. By doing so, training only requires 10,851 MB of GPU memory:

```bash
python train.py batch_size=2 augment.remix.group_size=2 model=Demucs
```

You can further reduce the batch size to 1 if data augmentation is disabled. By doing so, training only requires 7,703 MB of GPU memory:

```bash
python train.py batch_size=1 data_augmentation=False model=Demucs
```

You can also reduce the audio segment length to only 6 seconds. By doing so, training only requires 6,175 MB of GPU memory:

```bash
python train.py batch_size=1 data_augmentation=False segment=6 model=Demucs
```
## Testing pretrained model

You can use `test.py` to evaluate a pretrained model directly from an existing checkpoint. Pass the checkpoint path via the `resume_checkpoint` argument:

```bash
python test.py resume_checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'
```
## Half-Precision Training

By default, PyTorch Lightning uses 32-bit precision for training. To use 16-bit (half) precision, specify `trainer.precision`:

```bash
python train.py trainer.precision=16
```

Double precision is also supported by specifying `trainer.precision=64`, as shown below.
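For example:

```bash
python train.py trainer.precision=64
```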
## Default settings

The full list of arguments and their default values can be found in `conf/train_test_config.yaml`.

- `devices`: Selects which GPU(s) to use. If you have multiple GPUs and want to use GPU:2, set `devices=[2]`. If you want to use DDP (multi-GPU training), set `devices=2`; it will automatically use the first two GPUs available on your machine. If you want to use GPU:0, GPU:2, and GPU:3 for training, set `devices=[0,2,3]`.
- `download`: When set to `True`, automatically downloads and sets up the dataset. Default: `False`.
- `data_root`: Location of your dataset. If `download=True`, it is the directory the dataset will be downloaded to. Default: `'./musdb18hq'`.
- `model`: Which version of Demucs to use. The default model of this repo is Hybrid Demucs (v3). You can switch to Demucs (v2) by setting `model=Demucs`.
- `samplerate`: Sampling rate of the audio. Default: `44100`.
- `epochs`: Number of epochs to train the model. Default: `360`.
- `optim.lr`: Learning rate of the optimizer. Default: `3e-4`.
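For example, to train Demucs (v2) on GPU:0 for fewer epochs at a lower learning rate (the values `100` and `1e-4` here are purely illustrative):

```bash
python train.py devices=[0] model=Demucs epochs=100 optim.lr=1e-4
```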
## Inferencing

You can apply your trained model weights to your own audio files by using `inference.py`. The necessary arguments are the following:

- `checkpoint`: path of the trained model weight checkpoint file.
- `infer_audio_folder_path`: path of the folder that contains all of your audio files.
- `infer_audio_ext`: file type of your audio. Default: `'wav'`.

```bash
python inference.py infer_audio_folder_path='../../infer_audio' checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'
```

By default, Hydra saves all the separated audio in the `outputs` folder.
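If your folder contains a different audio format, override `infer_audio_ext` as well (`'mp3'` here is just an example value):

```bash
python inference.py infer_audio_folder_path='../../infer_audio' infer_audio_ext='mp3' checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'
```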
## Development

If you are a developer on this repo, please run:

```bash
pre-commit install
```
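Optionally, you can also run the installed hooks against the whole codebase once (standard `pre-commit` usage, not specific to this repo):

```bash
pre-commit run --all-files
```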