Related Work

Various papers and internet posts on training SAEs for vision.

Preprints

An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

  • Haven't read this yet, but Hugo Fry is an author.

LessWrong

Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers

  • Trains a sparse autoencoder on the 22nd layer of a CLIP ViT-L/14. First public work training an SAE on a ViT. Finds interesting features, demonstrating that SAEs work with ViTs.

Interpreting and Steering Features in Images

  • Haven't read it yet.

Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders

  • Followup to the above work; haven't read it yet.

A Suite of Vision Sparse Autoencoders

  • Trains sparse autoencoders on various layers of a CLIP ViT-L/14 trained on LAION-2B, using a TopK activation with k=32. The SAEs are trained on 1.2B tokens, including patch tokens (not just [CLS]). Limited evaluation.
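The TopK activation mentioned above can be sketched as follows. This is a minimal NumPy sketch of a TopK sparse autoencoder forward pass, not the paper's implementation; all names, shapes, and weights here are illustrative.

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k=32):
    """Hypothetical TopK SAE forward pass.

    x: (d_model,) activation vector (e.g. one ViT patch token).
    Returns (x_hat, z) where z has at most k nonzero entries.
    """
    pre = W_enc @ x + b_enc            # pre-activations, shape (d_sae,)
    z = np.maximum(pre, 0.0)           # ReLU
    if k < z.size:
        # TopK: zero out everything except the k largest activations
        drop = np.argpartition(z, -k)[:-k]
        z[drop] = 0.0
    x_hat = W_dec @ z + b_dec          # reconstruction, shape (d_model,)
    return x_hat, z
```

The TopK step enforces sparsity directly (at most k active latents per token) instead of relying on an L1 penalty.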