Demucs Lightning

A PyTorch Lightning version of Demucs with Hydra and TensorBoard features.

  1. Introduction
  2. Requirements
  3. Logging
  4. Training
    1. Demucs
    2. HDemucs
  5. Resume from Checkpoints
  6. Training with a less powerful GPU
  7. Testing pretrained model
  8. Half-Precision Training
  9. Default settings
  10. Inferencing
  11. Development

Introduction

The repository is organized as follows:

demucs_lightning
├── conf
│   ├── train_test_config.yaml
│   └── infer_config.yaml
├── demucs
│   ├── demucs.py
│   ├── hdemucs.py
│   └── other custom modules
├── requirements.txt
├── train.py
├── test.py
└── inference.py

There are two major released versions of Demucs.

  • Demucs (v2) operates in the waveform domain.
  • Hybrid Demucs (v3) features hybrid (time-domain and spectrogram-domain) source separation.

You can find their model definitions in demucs.py and hdemucs.py in the demucs folder.
For official information about Demucs, you can visit facebookresearch/demucs.

Demucs is trained on MusdbHQ. This repo uses AudioLoader to obtain the MusdbHQ dataset.
For more information about AudioLoader, you can visit KinWaiCheuk/AudioLoader.

Alternatively, you can download the MusdbHQ dataset manually from Zenodo.
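
A minimal sketch of the manual route is shown below. The Zenodo record URL and archive name are assumptions, so double-check them on the MUSDB18-HQ Zenodo page before use.

wget https://zenodo.org/record/3338373/files/musdb18hq.zip
unzip musdb18hq.zip -d ./musdb18hq

Afterwards, point data_root at the extracted folder, e.g. data_root='./musdb18hq'.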

Requirements

Python 3.8.10 and ffmpeg are required to run this repo.

If ffmpeg is not installed on your machine, you can install it via apt install ffmpeg

You can install all required libraries at once via

pip install -r requirements.txt

Logging

TensorBoard logging is used by default. If you want to use the WandbLogger instead (recommended!), either edit logger in conf/train_test_config.yaml or append logger=wandb to your commands.
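
For example, to train Demucs while logging to Weights & Biases:

python train.py devices=[0] model=Demucs logger=wandb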

Training

If it is your first time running the repo, you can use the argument download=True to automatically download and set up the musdb18hq dataset. Otherwise, you can omit this argument.

Demucs

Training Demucs requires 16,885 MB of GPU memory. If you do not have enough GPU memory, please read the section Training with a less powerful GPU below.

python train.py devices=[0] model=Demucs download=True

HDemucs

Training HDemucs requires 19,199 MB of GPU memory.

python train.py devices=[0] model=HDemucs download=True

Resume from Checkpoints

It is possible to continue training from an existing checkpoint by passing the resume_checkpoint argument. By default, Hydra saves all checkpoints at 'outputs/YYYY-MM-DD/HH-MM-SS/XXX_experiment_epoch=XXX_augmentation=XXX/version_1/checkpoints/XXX.ckpt'. For example, suppose you have already trained a checkpoint with 32-bit precision for 100 epochs via the following command:

python train.py devices=[0] trainer.precision=32 epochs=100

If you now want to train for 50 more epochs, you can use the following command (epochs is the total number of epochs, so it is set to 150):

python train.py devices=[0] trainer.precision=16 epochs=150 resume_checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'

You can always move your checkpoints to a more convenient location to shorten the path.
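
For example (the destination path below is illustrative):

mkdir -p checkpoints
cp 'outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt' checkpoints/demucs_e123.ckpt
python train.py devices=[0] epochs=150 resume_checkpoint='checkpoints/demucs_e123.ckpt'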

Training with a less powerful GPU

It is possible to reduce the GPU memory required to train the models by using the following tricks, but doing so might affect model performance.

Reduce Batch Size

You can reduce the batch size to 2. By doing so, it only requires 10,851 MB of GPU memory.

python train.py batch_size=2 augment.remix.group_size=2 model=Demucs

Disable Augmentation

You can further reduce the batch size to 1 if data augmentation is disabled. By doing so, it only requires 7,703 MB of GPU memory.

python train.py batch_size=1 data_augmentation=False model=Demucs

Reduce Audio Segment Length

You can reduce the audio segment length to 6 via the segment argument. By doing so, it only requires 6,175 MB of GPU memory.

python train.py batch_size=1 data_augmentation=False segment=6 model=Demucs

Testing pretrained model

You can use test.py to evaluate a pretrained model directly from an existing checkpoint. Pass the checkpoint path via the resume_checkpoint argument.

python test.py resume_checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'

Half-Precision Training

By default, PyTorch Lightning uses 32-bit precision for training. To use 16-bit precision (half precision), you can specify trainer.precision:

python train.py trainer.precision=16

Double precision is also supported by specifying trainer.precision=64.
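
For example, to train with double precision:

python train.py trainer.precision=64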

Default settings

The full list of arguments and their default values can be found in conf/train_test_config.yaml.

devices: Selects which GPU(s) to use. If you have multiple GPUs and want to use GPU:2, set devices=[2]. To use DDP (multi-GPU training), set devices=2; this automatically uses the first two GPUs available on your machine. To train on GPU:0, GPU:2, and GPU:3, set devices=[0,2,3]. See the examples below.
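
For example, the three cases described above correspond to the following commands:

python train.py devices=[2] model=Demucs
python train.py devices=2 model=Demucs
python train.py devices=[0,2,3] model=Demucs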

download: When set to True, it will automatically download and set up the dataset. Defaults to False.

data_root: The location of your dataset. If download=True, this is the directory the dataset will be downloaded to. Defaults to './musdb18hq'.

model: Selects which version of Demucs to use. The default is Hybrid Demucs (v3); you can switch to Demucs (v2) by setting model=Demucs.

samplerate: The sampling rate of the audio. Defaults to 44100.

epochs: The total number of epochs to train the model. Defaults to 360.

optim.lr: Learning rate of the optimizer. Defaults to 3e-4.
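
As an illustration, several of these defaults can be overridden in a single command (the values below are arbitrary examples, not recommendations):

python train.py devices=[0] model=Demucs epochs=200 optim.lr=1e-4 data_root='./musdb18hq'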

Inferencing

You can apply your trained model weights to your own audio files by using inference.py. The necessary arguments are the following:

  • checkpoint refers to the path of the trained model checkpoint file
  • infer_audio_folder_path refers to the path of the folder containing your audio files
  • infer_audio_ext refers to the file extension of your audio files. The default value is 'wav'
python inference.py infer_audio_folder_path='../../infer_audio' checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'
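
For example, to separate mp3 files instead of the default wav:

python inference.py infer_audio_ext=mp3 infer_audio_folder_path='../../infer_audio' checkpoint='outputs/2022-05-24/21-20-17/Demucs_experiment_epoch=360_augmentation=True/version_1/checkpoints/e=123-TRAIN_loss=0.08.ckpt'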

By default, Hydra saves all the separated audio in the outputs folder.

Development

If you are a developer on this repo, please run:

pre-commit install
