A package for analyzing and enhancing the quality of audio files.
It is intended to be used as part of a robot's voice communication system, where it processes recorded voice commands before they are passed to speech-to-text.
To run the scripts, some additional packages may need to be installed:
pip3 install audoai-noise-removal noisereduce numpy matplotlib scipy pydub
Common steps used to enhance the overall quality of the audio.
- Normalization (peak).
- Filters (low-pass and high-pass).
- Cutting out the silent parts of the audio.
- Eliminating background noise:
  - spectral subtraction / spectral gating (default; faster and works offline, with no API requests over the internet)
  - an online API that uses an AI model to remove background noise (gives better results than the algorithms mentioned above)
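
The normalization, filtering, and silence-removal steps listed above can be sketched with NumPy and SciPy as follows. This is a minimal illustration, not the package's actual implementation: the function names, the 80 Hz and 7000 Hz cutoffs, the -40 dB threshold, and the assumption of mono float samples in the range [-1, 1] are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# All names and default values below are illustrative assumptions.
# Input is assumed to be a mono float array with samples in [-1, 1].


def peak_normalize(samples, target_peak=0.95):
    """Scale the signal so its largest absolute sample equals target_peak."""
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples.copy()  # all-silent input: nothing to scale
    return samples * (target_peak / peak)


def band_limit(samples, sample_rate, low_hz=80.0, high_hz=7000.0, order=4):
    """Combine a high-pass at low_hz and a low-pass at high_hz (Butterworth)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass",
                 fs=sample_rate, output="sos")
    # Forward-backward filtering avoids shifting the signal in time.
    return sosfiltfilt(sos, samples)


def trim_silence(samples, sample_rate, frame_ms=20, threshold_db=-40.0):
    """Drop fixed-size frames whose RMS level is at or below threshold_db.

    Levels are relative to full scale (1.0). A trailing partial frame
    is discarded.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    level_db = 20 * np.log10(np.maximum(rms, 1e-10))
    return frames[level_db > threshold_db].reshape(-1)


# Example: a quiet 440 Hz tone padded with half a second of silence on each side.
sr = 16000
t = np.arange(sr) / sr
tone = 0.3 * np.sin(2 * np.pi * 440 * t)
audio = np.concatenate([np.zeros(sr // 2), tone, np.zeros(sr // 2)])

cleaned = trim_silence(peak_normalize(band_limit(audio, sr)), sr)
```

Filtering before normalizing means the peak is measured on the band-limited signal, so low-frequency rumble does not limit how much the voice band can be amplified.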
Generate plots of given audio to help visualize its state.
- waveform
- spectrogram
- volume over time
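
As a rough sketch of how the three plots above might be produced with SciPy and Matplotlib: the function names, frame size, FFT segment length, and output file name below are assumptions made for illustration, not the package's own API. Input is assumed to be a mono float array.

```python
import numpy as np
from scipy.signal import spectrogram


def rms_db_over_time(samples, sample_rate, frame_ms=50):
    """Return (frame start times in seconds, RMS level in dB re 1.0)."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    times = np.arange(n_frames) * frame_len / sample_rate
    return times, 20 * np.log10(np.maximum(rms, 1e-10))


def plot_audio(samples, sample_rate, out_path="audio_overview.png"):
    """Save a figure with waveform, spectrogram, and volume over time."""
    # Imported here so the numeric helper above works without Matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, (ax_wave, ax_spec, ax_vol) = plt.subplots(
        3, 1, figsize=(10, 8), sharex=True)

    # Waveform: amplitude against time.
    t = np.arange(len(samples)) / sample_rate
    ax_wave.plot(t, samples, linewidth=0.5)
    ax_wave.set_ylabel("Amplitude")

    # Spectrogram: power per frequency bin over time, on a dB scale.
    freqs, seg_times, power = spectrogram(samples, fs=sample_rate, nperseg=512)
    ax_spec.pcolormesh(seg_times, freqs, 10 * np.log10(power + 1e-12),
                       shading="auto")
    ax_spec.set_ylabel("Frequency [Hz]")

    # Volume over time: RMS level of short consecutive frames.
    vol_t, vol_db = rms_db_over_time(samples, sample_rate)
    ax_vol.plot(vol_t, vol_db)
    ax_vol.set_ylabel("RMS level [dB]")
    ax_vol.set_xlabel("Time [s]")

    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

Calling `plot_audio(samples, 16000)` on a one-dimensional array writes a single PNG with the three panels sharing a time axis, which makes it easier to line up a loud burst in the waveform with its footprint in the spectrogram.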
Calculate some commonly used metrics for the audio (e.g. Signal-to-Noise Ratio, Spectral Flatness), though they do not seem to be very informative when it comes to assessing audio quality.
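
For reference, the two metrics named above can be computed along the following lines. This is a hedged sketch using common textbook definitions, not necessarily the ones the package uses: the function names are hypothetical, and the SNR estimate assumes a noise-only segment of the recording is available for comparison.

```python
import numpy as np


def snr_db(signal_segment, noise_segment):
    """Estimate SNR in dB from mean power of two segments.

    signal_segment: samples containing the voice (plus any noise).
    noise_segment:  samples containing background noise only.
    Because the voice segment also contains noise, this slightly
    overestimates the true SNR.
    """
    signal_power = np.mean(np.square(signal_segment))
    noise_power = np.mean(np.square(noise_segment))
    return 10 * np.log10(signal_power / max(noise_power, 1e-20))


def spectral_flatness(samples):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 0 indicate a tonal signal; values closer to 1 indicate
    a noise-like, flat spectrum.
    """
    power = np.abs(np.fft.rfft(samples)) ** 2 + 1e-12  # avoid log(0)
    geometric_mean = np.exp(np.mean(np.log(power)))
    return geometric_mean / np.mean(power)


# Example: a pure tone is far less "flat" than white noise.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
noise = np.random.default_rng(0).normal(0.0, 0.1, sr)

tone_flatness = spectral_flatness(tone)
noise_flatness = spectral_flatness(noise)
estimated_snr = snr_db(tone + noise, noise)
```

Both numbers describe the whole clip with a single value, so a recording with a short burst of loud noise and one with constant low hiss can score similarly, which may be part of why such metrics say little about perceived quality.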