This repository contains the capsule-commandos' submission for the Capsule Vision 2024 Challenge. The project focuses on multi-class abnormality classification in video capsule endoscopy (VCE) frames, aiming to achieve accurate, automated detection of abnormalities. Our team, Capsule Commandos, ranked 7th, with a test-set mean AUC of 0.7314 and a balanced accuracy of 0.3235.
- Deep Learning Models: Includes the Vision Transformer, CNN, and ResNet architectures that we applied.
- Data Augmentation: Utilizes resizing, flips, rotations, and normalization for robustness.
- Collaboration: Built with contributions from Dev Rishi Verma, Vibhor Saxena, Dhruv Sharma, and Arpan Gupta.
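The augmentation steps listed above (resizing, flips, rotations, normalization) can be illustrated with a small dependency-free sketch. It is not the code from our notebooks: it assumes a grayscale image stored as a nested list of 0-255 values, limits rotations to multiples of 90 degrees, and uses placeholder values for the flip probability and the normalization mean and standard deviation. All function names here are hypothetical.

```python
import random

def resize_nearest(img, out_h, out_w):
    """Resize with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def normalize(img, mean, std):
    """Scale pixels to [0, 1], then standardize with the given mean and std."""
    return [[(px / 255.0 - mean) / std for px in row] for row in img]

def augment(img, size, rng, mean=0.5, std=0.5):
    """Resize, apply random flips and a random 90-degree rotation, then normalize.

    The 0.5 flip probability and the mean/std defaults are assumptions
    chosen for illustration only.
    """
    out = resize_nearest(img, size, size)
    if rng.random() < 0.5:
        out = hflip(out)
    if rng.random() < 0.5:
        out = vflip(out)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        out = rot90(out)
    return normalize(out, mean, std)

# Example: augment a 2x2 checkerboard up to 4x4 with a seeded generator.
sample = augment([[0, 255], [255, 0]], 4, random.Random(0))
print(len(sample), len(sample[0]))
```

Passing a seeded `random.Random` keeps the augmentation reproducible between runs, which helps when comparing models trained on the same data.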
The capsule-commandos repository for the Capsule Vision 2024 Challenge includes the following primary directories and files:
- Code Files: This folder contains Jupyter notebooks with training scripts for the models we tried in order to improve abnormality detection. Notably, davit.ipynb is our primary submission model.
- Evaluations: This folder contains Python scripts that generate Excel files and evaluation outputs such as ROC curves and confusion matrices. It focuses on our main davit model and includes the generated Excel files with its evaluation results. Similar evaluations for our other models can be created using the same methods.
- Reports: This directory holds JSON summaries of our models. The files davit_val_eval.json and davit_training_eval.json represent our final submissions.
- CapsuleCommandos Arxiv.pdf: This is version 1 of our arXiv report, available at arXiv:2410.19973.
- CapsuleCommandos_Arxiv_V2.pdf: This is version 2 of our arXiv report.
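For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute balanced accuracy and a one-vs-rest mean AUC in plain Python. It is an illustration only, not the code in the Evaluations folder: the function names, class labels, and toy scores are hypothetical, and we assume here that mean AUC means an unweighted average of per-class one-vs-rest AUCs.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)

def binary_auc(labels, scores):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(y_true, y_score, classes):
    """One-vs-rest AUC per class, averaged with equal weight per class.

    y_score[i][k] is the predicted score of sample i for classes[k].
    """
    aucs = []
    for k, c in enumerate(classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[k] for row in y_score]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)

# Toy example with three hypothetical classes.
classes = ["a", "b", "c"]
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
y_score = [
    [0.70, 0.20, 0.10],
    [0.30, 0.65, 0.05],
    [0.20, 0.60, 0.20],
    [0.10, 0.20, 0.70],
]
print(balanced_accuracy(y_true, y_pred))
print(mean_auc(y_true, y_score, classes))
```

The two metrics measure different things. AUC scores how well each class is ranked across all thresholds, while balanced accuracy scores only the final predicted label per frame. That is why a model can reach a fairly high mean AUC and still have a low balanced accuracy on an imbalanced dataset.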
To access our final models, visit: Final Models on Google Drive.
- Dev Rishi Verma
- Vibhor Saxena
- Dhruv Sharma
- Arpan Gupta