ai_industry_project

Multi-label audio file classification

Preprocessing

The preprocessing steps and their descriptions can be found in feature_extraction.

Models

  • Music Tagging Transformer (full implementation of the transformer)
  • Wav2Vec transformer (fine-tuned model)
  • Short-chunk CNN
  • RNN
  • (Naive predictor for baseline metrics)
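
The naive predictor is not described further here. As a rough illustration of what a baseline for multi-label classification can look like, the sketch below predicts the same label set for every clip, chosen from the label frequencies in the training data, and scores it with micro-averaged F1. The function names, the 0.5 threshold, the choice of metric, and the toy data are all assumptions made for this example and are not taken from the project code.

```python
from typing import List


def fit_label_priors(label_matrix: List[List[int]]) -> List[float]:
    """Return the fraction of training clips that carry each label.

    Assumes a non-empty multi-hot matrix: one row per clip, one column per label.
    """
    n_clips = len(label_matrix)
    n_labels = len(label_matrix[0])
    return [sum(row[j] for row in label_matrix) / n_clips for j in range(n_labels)]


def naive_predict(priors: List[float], n_clips: int, threshold: float = 0.5) -> List[List[int]]:
    """Predict one constant label set for every clip.

    A label is switched on if it occurs in at least `threshold` of the training clips.
    """
    constant = [1 if p >= threshold else 0 for p in priors]
    return [constant[:] for _ in range(n_clips)]


def micro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 4 training clips, 3 tags (multi-hot encoded).
train_labels = [[1, 0, 1], [1, 0, 0], [1, 1, 0], [0, 0, 1]]
test_labels = [[1, 0, 1], [0, 1, 0]]

priors = fit_label_priors(train_labels)
preds = naive_predict(priors, n_clips=len(test_labels))
baseline_score = micro_f1(test_labels, preds)
```

A trained model can then be judged by how far it exceeds `baseline_score` on the same test split.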

Structure

  • configs: configuration files to set parameters for preprocessing and the models
  • data: contains preprocessed data and labels
  • images: images used in markdown of the notebooks
  • lib_util: utility functions used across the project
  • (models: saved instances of trained models, not uploaded to git)
  • Plots: plotted results of preprocessing, data analysis and training
  • wav_data: raw data, audio files in WAV format
  • archive: old, experimental implementations, unused

Dependencies

Install the dependencies with pip install -r requirements.txt