Speech denoising systems usually enhance only the magnitude spectrum and leave the phase spectrum untouched. This system tries to improve the performance of a denoising system based on a denoising autoencoder neural network. The clean audio is estimated with a complex ideal ratio mask, which enhances the phase information as well as the magnitude.
Input : audio data in the mel-frequency domain
Output : complex ratio mask (cRM) [1]
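
As an illustration of the output target, the sketch below computes a complex ratio mask from a pair of noisy and clean complex spectrograms and applies it back to the noisy input. It uses the standard definition of the mask as the complex quotient of the clean and noisy spectra. The function names and the `eps` stabiliser are assumptions made for this example and are not taken from this project's code.

```python
import numpy as np


def complex_ratio_mask(noisy_stft, clean_stft, eps=1e-8):
    """Return the complex ratio mask M such that M * noisy_stft ~= clean_stft.

    Both inputs are complex arrays of the same shape (frequency x time).
    eps is a hypothetical stabiliser that avoids division by zero.
    """
    y_r, y_i = noisy_stft.real, noisy_stft.imag
    s_r, s_i = clean_stft.real, clean_stft.imag
    denom = y_r ** 2 + y_i ** 2 + eps
    # Real and imaginary parts of the quotient clean / noisy.
    m_r = (y_r * s_r + y_i * s_i) / denom
    m_i = (y_r * s_i - y_i * s_r) / denom
    return m_r + 1j * m_i


def apply_mask(noisy_stft, mask):
    """Estimate the clean spectrogram by complex multiplication."""
    return noisy_stft * mask
```

Because the mask is complex, multiplying by it changes both the magnitude and the phase of each time-frequency bin, which is what separates it from a magnitude-only mask.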
The model is built with a linear layer layout (2049-500-180) without weight lock [2].
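
To make the layer layout concrete, here is a minimal NumPy sketch of an autoencoder with encoder sizes 2049-500-180. Several things in it are assumptions and not details confirmed by this project: that "weight lock" means tying the decoder weights to the transposed encoder weights, that the decoder mirrors the encoder (180-500-2049), and the `tanh` activation and initialisation scheme. All names are hypothetical.

```python
import numpy as np

LAYER_SIZES = [2049, 500, 180]  # encoder layout from the description


def init_autoencoder(layer_sizes, seed=0):
    """Create independent (untied) weights for encoder and decoder layers."""
    rng = np.random.default_rng(seed)
    enc_dims = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    # Assumed mirrored decoder: 180 -> 500 -> 2049.
    dec_dims = [(n_out, n_in) for n_in, n_out in reversed(enc_dims)]
    params = []
    for n_in, n_out in enc_dims + dec_dims:
        # Each layer gets its own matrix, so decoder weights are not
        # transposes of the encoder weights (no weight tying).
        w = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params


def forward(params, x):
    """Run a batch of shape (batch, 2049) through all layers."""
    h = x
    for idx, (w, b) in enumerate(params):
        h = h @ w + b
        if idx < len(params) - 1:
            h = np.tanh(h)  # assumed hidden activation; last layer is linear
    return h
```

With tied weights the decoder would reuse `w.T` from the matching encoder layer; keeping separate matrices roughly doubles the parameter count but lets the decoder be learned independently.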
youtube-dl : a command-line program to download videos from YouTube.com and a few more sites
SoX : a cross-platform command-line utility to convert audio files from various formats into other formats
FFmpeg : a complete, cross-platform solution to record, convert and stream audio and video
librosa : a Python package for music and audio analysis
[2] Speech Synthesis with Deep Denoising Autoencoder, Zhenzhou Wu