I took my personal backup (mostly family photos, some random scans, etc.), extracted embeddings from ResNet (penultimate-layer activations), and then plotted them with the UMAP algorithm and the bokeh library.
This was a one-afternoon learning exercise while doing the awesome Fast AI course. The results are quite fun. ResNet returns 512 features, and UMAP maps those features onto a 2D plane, preserving distances as much as possible. The clusters show that the embeddings "make sense": images within a cluster are similar to one another and different from those outside it.
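To make that concrete, here is a minimal sketch of the core idea, assuming torchvision, Pillow, and umap-learn; `embed` and `paths` are illustrative names, not the actual scripts (those follow below):

```python
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
import umap  # umap-learn
from PIL import Image

# Pretrained resnet18 with the final classifier replaced by Identity,
# so the forward pass returns the penultimate 512-d activations.
model = models.resnet18(pretrained=True)
model.fc = torch.nn.Identity()
model.eval()

# Standard ImageNet preprocessing for 224x224 thumbnails.
preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(path):
    """Return the 512-d embedding of one image."""
    img = Image.open(path).convert("RGB")
    with torch.no_grad():
        return model(preprocess(img).unsqueeze(0)).squeeze(0).numpy()

# paths is a hypothetical list of thumbnail file names.
paths = ["output/torch_thumbs/example.jpg"]
features = np.stack([embed(p) for p in paths])          # shape: (n_images, 512)
xy = umap.UMAP(n_components=2).fit_transform(features)  # shape: (n_images, 2)
```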
Here is a cluster of too-dark pics:
Kids on bike and red background:
Some seaside:
Random landscapes:
Bathroom hardware store:
- Create thumbnails of the pictures with `process_photos.py` (a sketch of this step appears after the list):

  ```
  ./process_photos.py -d /Volumes/MyStuff -o output/torch_thumbs -s 224 >> output/torch_thumbs.meta

  # To resume an interrupted session use:
  #
  # ./process_photos.py -d /Volumes/MyStuff -o output/torch_thumbs -s 224 -l output/torch_thumbs.meta | tee -a output/torch_thumbs.meta
  ```
- Extract features with the resnet18 model using `embeddings.py`. I also tried resnet34, but the results were actually worse:

  ```
  find output/torch_thumbs -name \*.jpg | ./embeddings.py -o output/features_resnet18 -m resnet18 -
  ```
- Project the embeddings from the 512-dimensional feature space to good old 2D space with umap-learn, and visualize the results with bokeh. Both steps run from a Jupyter notebook, `visualize.ipynb` (see the visualization sketch after this list).
- To show thumbnails when you hover over a point, run `python3 -m http.server` to serve the images.
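For the curious, the thumbnailing step boils down to something like this sketch, assuming Pillow and a plain square resize to 224x224; judging by the commands above, the real `process_photos.py` also writes a metadata line per photo to stdout, which the `print` below only gestures at:

```python
from pathlib import Path
from PIL import Image

SIZE = 224  # matches the -s 224 flag used above

def make_thumbnail(src: Path, out_dir: Path) -> Path:
    """Resize one photo to a SIZE x SIZE JPEG thumbnail (illustrative helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    dst = out_dir / (src.stem + ".jpg")
    with Image.open(src) as img:
        img.convert("RGB").resize((SIZE, SIZE)).save(dst, "JPEG")
    return dst

# Walk the photo directory and thumbnail every JPEG found.
for photo in Path("/Volumes/MyStuff").rglob("*.jpg"):
    print(make_thumbnail(photo, Path("output/torch_thumbs")))
```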
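And here is a hedged sketch of the kind of hover-enabled scatter plot `visualize.ipynb` could produce with bokeh. The stand-in data (`xy`, `thumb_names`) would come from UMAP and the meta file in the real notebook, and the thumbnails are assumed to be served on port 8000 by the `http.server` command above:

```python
import numpy as np
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure, show

# Stand-ins: in the notebook, xy comes from umap-learn and thumb_names
# from the torch_thumbs.meta file produced earlier.
xy = np.random.rand(100, 2)
thumb_names = [f"img_{i:03d}.jpg" for i in range(100)]

source = ColumnDataSource(data=dict(
    x=xy[:, 0],
    y=xy[:, 1],
    # Thumbnails served by `python3 -m http.server` (port 8000 by default).
    thumb=[f"http://localhost:8000/{name}" for name in thumb_names],
))

p = figure(width=800, height=800, title="Photo embeddings (resnet18 + UMAP)")
p.scatter("x", "y", source=source, size=5, alpha=0.6)

# Render the image itself as the hover tooltip.
p.add_tools(HoverTool(tooltips='<img src="@thumb" width="112">'))
show(p)
```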
Voilà!