ViTB16 Clustering

Explores K-Means clustering and cosine similarity matrices on image features extracted from a ViT-B/16 vision transformer.
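As a rough illustration of the two techniques named above, here is a minimal, dependency-free sketch. It is not the code in this repository: the function names, the toy 3-D `features` list (a hypothetical stand-in for real ViT embeddings), and the deterministic farthest-point initialization are all assumptions made for the example. In practice you would more likely use NumPy, PyTorch, or scikit-learn.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity_matrix(features):
    """Return the N x N matrix of pairwise cosine similarities."""
    unit = [l2_normalize(f) for f in features]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def kmeans(features, k, iters=20):
    """Lloyd's K-Means on L2-normalized features; returns one label per row.

    Assumption for this sketch: centroids start from a deterministic
    farthest-point initialization rather than random sampling.
    """
    points = [l2_normalize(f) for f in features]
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical stand-in for ViT features: two well-separated groups in 3-D.
features = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]
sim = cosine_similarity_matrix(features)
labels = kmeans(features, k=2)
```

Here `sim[i][j]` is close to 1 for vectors pointing in similar directions and close to 0 for near-orthogonal ones, and `labels` groups the first two and last two toy vectors into separate clusters. Normalizing before K-Means makes Euclidean distance rank neighbors the same way cosine similarity does, which is a design choice of this sketch rather than something the repository is known to do.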

Requirements

CUDA 11.7
Python 3.10

Getting Started

git clone https://github.com/tsugg/ViTB16-Clustering.git

cd ViTB16-Clustering

python3 -m venv venv

source venv/bin/activate

pip install -r requirements.txt

Extra

The visualization and cosine similarity code comes from this notebook: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/image_embeddings_analysis_part_1.ipynb
The ViT-B/16 model card is on Hugging Face: https://huggingface.co/facebook/dino-vitb16
The K-Means strategy comes from the DINOv2 paper: https://arxiv.org/pdf/2304.07193v1.pdf