This is Team Epoch IV's solution to the BirdCLEF 2024 competition, placed 247/992.
The award-winning working note, "Addressing the Challenges of Domain Shift in Bird Call Classification for BirdCLEF 2024", is included in this repository at ./docs/working_note/.
This section contains the steps that need to be taken to reproduce our best submission on the private leaderboard.
Models were trained on machines with the following specifications:
- CPU: AMD Ryzen Threadripper Pro 3945WX 12-Core Processor / AMD Ryzen 9 7950X 16-Core Processor
- GPU: NVIDIA RTX A5000 / NVIDIA Quadro RTX 6000 / NVIDIA RTX A6000
- RAM: 96GB / 128GB
- OS: Ubuntu 23.10 / Arch Linux 2024.04.10
- Python: 3.10.13
Estimated training time: 1-3 hours per model on these machines.
For running inference, a machine with at least 32GB of RAM is recommended.
Clone the repository with your favourite Git client or with the following command:
git clone https://github.com/TeamEpochGithub/iv-q4-detect-bird.git
After cloning the repository, navigate to the project directory. Make sure Rye is installed on your machine and run:
rye sync
Alternatively, you can manually install Python 3.10.13, set up a virtual environment, and install the dependencies from requirements-dev.lock using pip:
pip install -r requirements-dev.lock
Download the competition data from the BirdCLEF 2024 competition page on Kaggle, or use the following Kaggle CLI command:
kaggle competitions download -c birdclef-2024
Then extract birdclef-2024.zip to data/raw/2024/.
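If you prefer to do the extraction from Python rather than with an archive tool, a minimal sketch could look like the following. The function name extract_competition_data is hypothetical and not part of this repository; it only uses the standard library.

```python
import zipfile
from pathlib import Path


def extract_competition_data(zip_path: Path, target_dir: Path) -> list[str]:
    """Extract every member of zip_path into target_dir and return the member names.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # Create data/raw/2024/ (or any other target) if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

Calling extract_competition_data(Path("birdclef-2024.zip"), Path("data/raw/2024")) from the project root should leave the competition files under data/raw/2024/, assuming the archive was downloaded into the project root.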
Place the sounds you want to make predictions on in data/raw/2024/test_soundscapes/.
train.py is used to train a model. It reads a configuration file from conf/train.yaml, which contains the model configuration to train along with additional training parameters such as test_size and the scorer to use.
The model selected in conf/train.yaml can be found in the conf/model/ folder, where the whole model configuration is stored, from preprocessing to postprocessing.
When training is finished, the model is saved in tm/ with a hash that depends on the specific preprocessing and pretraining steps and on the model configuration.
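To illustrate why a configuration-dependent hash is useful as a file name, here is a minimal sketch of one way such a hash could be derived. This is an assumption for illustration only: the function config_hash, the use of SHA-256, and the example configuration keys are all hypothetical, and the scheme actually used by this repository may differ.

```python
import hashlib
import json
from typing import Any


def config_hash(config: dict[str, Any]) -> str:
    """Return a deterministic hex digest for a JSON-serialisable configuration.

    Hypothetical helper; the repository's real hashing scheme is not shown here.
    """
    # Sorting keys makes the digest independent of dictionary insertion order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical configurations: the same settings written in a different order.
config_a = {"preprocessing": ["resample", "mel_spectrogram"], "model": {"epochs": 50}}
config_b = {"model": {"epochs": 50}, "preprocessing": ["resample", "mel_spectrogram"]}

# Identical settings give the same digest, so a saved model can be found again.
print(config_hash(config_a) == config_hash(config_b))
```

With a scheme like this, changing any preprocessing step or model parameter produces a different file name, so models trained with different settings do not overwrite each other.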
You can skip this step if you only want to run inference on the test data with the model from our best submission, as we already included this model in this repository as tm/cfd080d568b9341fa9b02decb0e59ae1_0.pt.
If you wish to retrain this model, ensure that the model in conf/train.yaml is set to playful-monkey-752 and run train.py.
submit.py runs inference on the test data from the competition given a trained model or an ensemble of trained models.
It reads a configuration file from conf/submit.yaml, which contains the model/ensemble configuration to use for inference.
Model configs can be found in conf/model/ and ensemble configs in conf/ensemble. An ensemble config specifies the models (from conf/model) to use for the ensemble and the weight to use for each model.
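As a rough illustration of what per-model weights mean, the sketch below combines predictions from several models as a weighted average. It assumes the weights are normalised to sum to one and that each model outputs one probability per class per audio segment; the function weighted_ensemble is hypothetical, and the ensembling code in this repository may work differently.

```python
from collections.abc import Sequence


def weighted_ensemble(
    predictions: Sequence[Sequence[Sequence[float]]],
    weights: Sequence[float],
) -> list[list[float]]:
    """Weighted average of per-model predictions.

    predictions[m][i][j] is model m's score for class j on segment i.
    Hypothetical helper for illustration; not part of this repository.
    """
    if len(predictions) != len(weights):
        raise ValueError("Need exactly one weight per model.")
    total = sum(weights)
    if total <= 0:
        raise ValueError("Weights must sum to a positive value.")

    n_rows = len(predictions[0])
    n_cols = len(predictions[0][0])
    combined = [[0.0] * n_cols for _ in range(n_rows)]
    for model_predictions, weight in zip(predictions, weights):
        for i, row in enumerate(model_predictions):
            for j, score in enumerate(row):
                # Normalise so the result stays on the same scale as the inputs.
                combined[i][j] += (weight / total) * score
    return combined
```

For example, with two models weighted 3 and 1, the first model contributes 75% of each combined score and the second contributes 25%.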
Ensure that the model in conf/submit.yaml is set to playful-monkey-752 and run submit.py to generate the submission file submission/submission.csv.
This repository uses pre-commit with Ruff and MyPy hooks for code quality checks and auto-formatting. To install the pre-commit hooks, run:
rye run pre-commit install
To run the pre-commit checks on all files, run:
rye run pre-commit run --all-files
This repository was created by Team Epoch IV, based in the Dream Hall of the Delft University of Technology.
Read more about this competition here.