wstyczen/sound_processing

Sound processing

A package for analyzing and enhancing the quality of audio files.

Intended to be used as part of a robot's voice communication system to process recorded voice commands before they are passed to speech-to-text.

Python dependencies

To run the scripts, additional packages may need to be installed:

pip3 install audoai-noise-removal noisereduce numpy matplotlib scipy pydub

Usage

Audio enhancement

Common steps used to enhance the overall quality of the audio.

  • Normalization (peak).
  • Filters (low pass & high pass).
  • Cutting out the silent parts of the audio.
  • Eliminating background noise:
    • spectral subtraction / spectral gating (default; faster and does not require API requests over the internet)
    • an online API that uses an AI model to remove background noise (gives better results than the algorithms mentioned above)
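
The first three steps above can be sketched with NumPy and SciPy alone. This is only an illustration under stated assumptions, not the package's own implementation: the function names (`peak_normalize`, `band_limit`, `drop_silence`), the cutoff frequencies, the frame length, and the silence threshold are all hypothetical choices, and samples are assumed to be floats in the range [-1, 1].

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def peak_normalize(samples, target_peak=0.9):
    """Scale the signal so its largest absolute sample equals target_peak."""
    samples = np.asarray(samples, dtype=np.float64)
    peak = np.max(np.abs(samples))
    if peak == 0.0:
        return samples.copy()  # all-zero input: nothing to scale
    return samples * (target_peak / peak)


def band_limit(samples, sample_rate, low_hz=100.0, high_hz=4000.0, order=4):
    """Apply a high-pass and then a low-pass Butterworth filter.

    The cutoffs are assumed values for speech; high_hz must stay below
    sample_rate / 2.
    """
    sos_hp = butter(order, low_hz, btype="highpass", fs=sample_rate, output="sos")
    sos_lp = butter(order, high_hz, btype="lowpass", fs=sample_rate, output="sos")
    # sosfiltfilt runs each filter forward and backward (zero phase shift).
    return sosfiltfilt(sos_lp, sosfiltfilt(sos_hp, samples))


def drop_silence(samples, sample_rate, frame_ms=20, threshold_db=-40.0):
    """Keep only frames whose RMS level exceeds threshold_db (re. full scale 1.0).

    A trailing partial frame, if any, is discarded.
    """
    samples = np.asarray(samples, dtype=np.float64)
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return samples.copy()
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    threshold = 10 ** (threshold_db / 20)
    return frames[rms > threshold].reshape(-1)
```

A possible order of use is `drop_silence(peak_normalize(band_limit(samples, sample_rate)), sample_rate)`: filtering first keeps low-frequency rumble from dominating the peak, and normalizing before trimming makes the fixed silence threshold more meaningful.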

Audio plots

Generate plots of the given audio to help visualize its state.

  • waveform
  • spectrogram
  • volume over time
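
As a rough sketch of how these three plots could be produced with NumPy, SciPy and Matplotlib: the helper names `plot_data` and `save_plots`, the 50 ms volume frame, the FFT segment size, and the output file name are assumptions made for illustration and are not taken from the package.

```python
import numpy as np
from scipy.signal import spectrogram


def plot_data(samples, sample_rate, frame_ms=50):
    """Compute the arrays behind the waveform, spectrogram and volume plots."""
    samples = np.asarray(samples, dtype=np.float64)
    times = np.arange(len(samples)) / sample_rate

    # Spectrogram: power per (frequency, time) bin, converted to dB.
    freqs, spec_times, power = spectrogram(samples, fs=sample_rate, nperseg=512)
    power_db = 10 * np.log10(power + 1e-12)

    # Volume over time: RMS level of consecutive frames, in dB.
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms_db = 20 * np.log10(np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12)
    vol_times = (np.arange(n_frames) + 0.5) * frame_len / sample_rate

    return {
        "samples": samples, "times": times,
        "freqs": freqs, "spec_times": spec_times, "power_db": power_db,
        "vol_times": vol_times, "rms_db": rms_db,
    }


def save_plots(samples, sample_rate, path="audio_plots.png"):
    """Draw the three plots in one figure and save it to path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    d = plot_data(samples, sample_rate)
    fig, (ax_wave, ax_spec, ax_vol) = plt.subplots(3, 1, figsize=(10, 8))
    ax_wave.plot(d["times"], d["samples"])
    ax_wave.set(title="Waveform", xlabel="Time [s]", ylabel="Amplitude")
    ax_spec.pcolormesh(d["spec_times"], d["freqs"], d["power_db"], shading="auto")
    ax_spec.set(title="Spectrogram", xlabel="Time [s]", ylabel="Frequency [Hz]")
    ax_vol.plot(d["vol_times"], d["rms_db"])
    ax_vol.set(title="Volume over time", xlabel="Time [s]", ylabel="RMS [dB]")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Samples could be loaded with, for example, `scipy.io.wavfile.read`, which returns the sample rate and a NumPy array; integer data would need scaling to floats first.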

Metrics

Calculate some commonly used metrics for the audio (e.g. signal-to-noise ratio, spectral flatness). In practice they do not seem to be very informative for assessing audio quality.
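
For reference, the two metrics named above can be computed roughly as follows. This is a minimal sketch under assumptions, not the package's code: the function names are hypothetical, the SNR version assumes a separate noise-only segment is available (which is rarely the case for real recordings), and spectral flatness is taken as the ratio of the geometric mean to the arithmetic mean of the power spectrum.

```python
import numpy as np


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from a signal segment and a noise-only segment."""
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    return 10 * np.log10(p_signal / p_noise)


def spectral_flatness(samples):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a noise-like (flat) spectrum, values near 0
    indicate a tonal one.
    """
    samples = np.asarray(samples, dtype=np.float64)
    # A small offset avoids log(0) for empty frequency bins.
    power = np.abs(np.fft.rfft(samples)) ** 2 + 1e-12
    return np.exp(np.mean(np.log(power))) / np.mean(power)
```

Both values describe the signal as a whole rather than how intelligible the speech is, which may be one reason they say little about perceived quality.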
