- 📚 Table of Contents
- 📍 Overview
- ⚙️ Project Structure
- 💻 Modules
- 🚀 Getting Started
- 🤝 Contributing
- License
- Acknowledgments
The Cricket Classification project is an audio classification system that uses deep learning to identify and categorize cricket species from their sound recordings. It is built with the PyTorch Lightning framework and the ASTForAudioClassification model from Hugging Face's Transformers library. The code covers data preprocessing, model training, and evaluation, providing a complete end-to-end pipeline for cricket sound classification.
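As a rough illustration of how such a classifier can be wired together (not the repository's exact main.py), the sketch below loads a pre-trained AST checkpoint and wraps it in a minimal PyTorch Lightning module; the checkpoint name, label count, and optimizer settings are assumptions.

```python
# Minimal sketch (not the project's exact main.py): a pre-trained AST checkpoint
# wrapped in a PyTorch Lightning module for genus classification.
# The checkpoint name, num_labels, and learning rate are illustrative assumptions.
import pytorch_lightning as pl
import torch
from transformers import ASTForAudioClassification


class CricketClassifier(pl.LightningModule):
    def __init__(self, num_labels: int = 10, lr: float = 1e-5):
        super().__init__()
        self.model = ASTForAudioClassification.from_pretrained(
            "MIT/ast-finetuned-audioset-10-10-0.4593",  # assumed checkpoint
            num_labels=num_labels,
            ignore_mismatched_sizes=True,  # replace the AudioSet head with a new one
        )
        self.lr = lr

    def training_step(self, batch, batch_idx):
        # batch["input_values"]: spectrogram features produced by the AST feature extractor
        outputs = self.model(input_values=batch["input_values"], labels=batch["labels"])
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)
```
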
| Experiment | Test Accuracy |
|---|---|
| 5-genus classification | 97.00% |
| 8-genus classification | 94.40% |
| 10-genus classification | 89.51% |
These results are obtained on test data using an 80:20 train:test split. The train and test waveforms are split into 10-second segments with a 5-second overlap.
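The segmentation step amounts to simple array slicing; the sketch below is a generic illustration (not the repository's exact code) of cutting a waveform into 10-second windows with a 5-second hop, i.e. a 5-second overlap between consecutive windows, assuming 16 kHz audio.

```python
# Illustrative segmentation (assumed 16 kHz sample rate), not the repo's exact code:
# 10-second windows that advance by 5 seconds, so consecutive windows overlap by 5 s.
import numpy as np


def segment_waveform(wav: np.ndarray, sr: int = 16000,
                     win_s: float = 10.0, hop_s: float = 5.0):
    win, hop = int(win_s * sr), int(hop_s * sr)
    segments = []
    for start in range(0, max(len(wav) - win, 0) + 1, hop):
        segments.append(wav[start:start + win])
    return segments


# Example: a 25-second clip yields windows starting at 0 s, 5 s, 10 s, and 15 s.
print(len(segment_waveform(np.zeros(25 * 16000))))  # -> 4
```
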
```
.
├── config.json
├── data
│   ├── final_features
│   ├── raw_all_data
│   └── vad_processed
├── dataset.py
├── feature_extractor.py
├── helpers
│   ├── data.txt
│   ├── make_data.py
│   └── make_data_dir.sh
├── main.py
├── preprocess.py
├── readme.md
├── requirements.txt
├── run_pipeline.sh
├── utils.py
└── val.py
```
| File | Summary |
|---|---|
| run_pipeline.sh | Runs the complete pipeline: preprocessing, feature extraction, and model training. |
| preprocess.py | Processes the set of audio files for training, using the Silero Voice Activity Detector (VAD) model to extract the relevant speech segments. |
| dataset.py | Defines a CustomDataset class that inherits from PyTorch's Dataset class, tailored to the cricket audio data. |
| utils.py | Demonstrates how to remove human voice from an audio file using the Silero VAD model (see the sketch just below the table). |
| feature_extractor.py | Extracts features from audio samples using a pre-trained feature extractor from the transformers library. The process_samples_in_batches function processes audio samples in batches, applying the feature extractor to each sample and storing the extracted features along with the sample's label (a second sketch below illustrates this batching). |
| main.py | Trains the cricket audio classifier using a pre-trained ASTForAudioClassification model from the transformers library. |
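For the voice-removal step described for utils.py, a hedged sketch of how human speech can be stripped with Silero VAD is shown below; the file name and sample rate are placeholders, and the project's own implementation may differ in detail.

```python
# Sketch of removing human voice with Silero VAD (file name and rate are placeholders;
# the project's utils.py / preprocess.py may differ in detail).
import torch

model, utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
get_speech_timestamps, _, read_audio, _, _ = utils

SR = 16000
wav = read_audio("recording.wav", sampling_rate=SR)  # placeholder file name
speech = get_speech_timestamps(wav, model, sampling_rate=SR)

# Keep everything that is NOT detected as human speech.
keep, cursor = [], 0
for ts in speech:
    keep.append(wav[cursor:ts["start"]])
    cursor = ts["end"]
keep.append(wav[cursor:])
cricket_only = torch.cat(keep)
```
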
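The batching in feature_extractor.py can be pictured along these lines; the sketch uses the ASTFeatureExtractor from transformers with an assumed checkpoint name and batch size, and is not the repository's exact process_samples_in_batches.

```python
# Illustrative batched feature extraction (assumed checkpoint and batch size;
# not the exact process_samples_in_batches from feature_extractor.py).
from transformers import ASTFeatureExtractor

extractor = ASTFeatureExtractor.from_pretrained("MIT/ast-finetuned-audioset-10-10-0.4593")


def extract_in_batches(samples, labels, batch_size=32, sr=16000):
    """samples: list of 1-D numpy waveforms; labels: matching list of genus ids."""
    features = []
    for i in range(0, len(samples), batch_size):
        batch = samples[i:i + batch_size]
        inputs = extractor(batch, sampling_rate=sr, return_tensors="pt")
        for spec, label in zip(inputs["input_values"], labels[i:i + batch_size]):
            features.append({"input_values": spec, "label": label})
    return features
```
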
- Clone the cricket-classification repository:

```sh
git clone https://github.com/pvbhanuteja/cricket-classification
```

- Change to the project directory:

```sh
cd cricket-classification
```

- Install the dependencies:

```sh
pip install -r requirements.txt
```

- Update config.json with the correct paths (an illustrative config sketch follows these steps), then run the pipeline script:

```sh
sh run_pipeline.sh
```
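The key names below are purely illustrative assumptions (check the config.json shipped with the repository for the exact fields it expects); the snippet only shows the kind of path settings the pipeline relies on, pointing at the data directories from the project structure above.

```python
# Hypothetical example of filling in config.json before running run_pipeline.sh.
# Key names here are assumptions for illustration; the real config.json may differ.
import json

config = {
    "raw_data_dir": "data/raw_all_data",      # raw cricket recordings
    "vad_output_dir": "data/vad_processed",   # audio after VAD cleanup
    "features_dir": "data/final_features",    # extracted AST features
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```
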
Check out CONTRIBUTING.md for best practices and instructions on how to contribute to this project.
This project is licensed under the MIT License.
- Professor Dr. Yoonsuck Choe.
- This work was supported in part by the Texas Virtual Data Library (ViDaL) funded by the Texas A&M University Research Development Fund.