
# Visualize ResNet embeddings with UMAP

I took my personal backup (mostly family photos, some random scans, etc.), extracted embeddings from ResNet (penultimate-layer activations), and then plotted them with the UMAP algorithm and the bokeh library.

This was a one-afternoon learning exercise while doing the awesome Fast.AI course. The results are quite fun. ResNet returns 512 features, and UMAP maps those features onto a 2D plane, preserving distances as much as possible. The clusters show that the embeddings "make sense": images within a cluster are similar to each other and different from those outside the cluster.

Here is a cluster of too-dark pics:

img

Kids on bike and red background:

img

Some seaside:

img

Random landscapes:

img

Bathroom hardware store:

img

## To recreate

1. Create thumbnails of the pictures with `process_photos.py`:

   ```sh
   ./process_photos.py -d /Volumes/MyStuff -o output/torch_thumbs -s 224 >> output/torch_thumbs.meta
   ```

   To resume an interrupted session, use:

   ```sh
   ./process_photos.py -d /Volumes/MyStuff -o output/torch_thumbs -s 224 -l output/torch_thumbs.meta | tee -a output/torch_thumbs.meta
   ```
2. Extract features with the resnet18 model using `embeddings.py`. I also tried resnet34, but the results were actually worse.

   ```sh
   find output/torch_thumbs -name \*.jpg | ./embeddings.py -o output/features_resnet18 -m resnet18 -
   ```
3. Cast the embeddings from the 512-dimensional feature space to good old 2D space with umap-learn, and visualize the results with bokeh. Running UMAP and the visualization is done from the Jupyter notebook `visualize.ipynb`.

4. To show thumbnails when you hover over a point, run `python3 -m http.server` to serve the images.

Voilà!

HN | Tw | Blog
