Skip to content

peak1995/segan

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SEGAN

A PyTorch implementation of SEGAN based on the paper SEGAN: Speech Enhancement Generative Adversarial Network.

Requirements

conda install pytorch torchvision -c pytorch
  • librosa
pip install librosa
  • tqdm
conda install tqdm

Datasets

The clear and noisy speech datasets are downloaded from DataShare. Download the 56kHZ train datasets and test datasets, then extract them into data directory.

If you want using other datasets, you should change the path of data defined on data_preprocess.py.

Usage

Data Pre-process

python data_preprocess.py

The pre-processed datas are on data/serialized_train_data and data/serialized_test_data.

Train Model and Test

python main.py ----batch_size 128 --num_epochs 300
optional arguments:
--batch_size             train batch size [default value is 50]
--num_epochs             train epochs number [default value is 86]

The test results are on results.

Test Audio

python test_audio.py ----file_name p232_160.wav --epoch_name generator-80.pkl
optional arguments:
--file_name              audio file name
--epoch_name             generator epoch name

The generated enhanced audio is on the same directory of input audio.

Results

The example results and the pre-train Generator weight can be downloaded from here.

About

基于gan的语音增强

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages