This repository contains explainability tools for the internal representations of large vision-language models.
- [2024.10.30]: The XL-VLMs repo is public.
- [2024.09.25]: Our paper A Concept-based Explainability Framework for Large Multimodal Models has been accepted at NeurIPS 2024.
We support the approaches introduced in the following papers:
- A Concept-based Explainability Framework for Large Multimodal Models (NeurIPS 2024)

Large multimodal models (LMMs) combine unimodal encoders and large language models (LLMs) to perform multimodal tasks. Despite recent advances towards the interpretability of these models, understanding the internal representations of LMMs remains largely a mystery. In this paper, we present a novel framework for the interpretation of LMMs. We propose a dictionary-learning-based approach applied to token representations. The elements of the learned dictionary correspond to our proposed concepts, which we show to be semantically well grounded in both vision and text; we therefore refer to them as "multimodal concepts". We evaluate the learned concepts qualitatively and quantitatively, and show that the extracted multimodal concepts are useful for interpreting the representations of test samples. Finally, we evaluate the disentanglement between different concepts and the quality of their visual and textual grounding.
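To make the idea concrete, here is a rough sketch of dictionary-learning-based concept extraction on stored token representations. It uses scikit-learn's NMF purely for illustration; the factorization method, array shapes, and file name are assumptions and do not necessarily match the repository's implementation in src/analyse_features.py.

```python
# Minimal sketch of dictionary-learning-based concept extraction.
# Assumptions: hidden states are stored as a (num_samples, hidden_dim) array,
# and plain NMF is used purely for illustration -- the repository's actual
# decomposition (see src/analyse_features.py) may differ.
import numpy as np
from sklearn.decomposition import NMF

num_concepts = 20

# Z: representations of a target token (e.g. 'dog') collected over many samples.
Z = np.load("hidden_states_dog.npy")          # hypothetical file name

# NMF requires non-negative inputs; shift as a crude workaround for this sketch.
Z_shifted = Z - Z.min()

# Z_shifted is approximated by V @ U, where rows of U are the learned "concept"
# directions and V holds the per-sample concept activations.
model = NMF(n_components=num_concepts, init="nndsvd", max_iter=500)
V = model.fit_transform(Z_shifted)            # (num_samples, num_concepts)
U = model.components_                         # (num_concepts, hidden_dim)

# Samples that activate a concept most strongly can be used for visual grounding.
top_samples_per_concept = np.argsort(-V, axis=0)[:5].T   # (num_concepts, 5)
print(top_samples_per_concept)
```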
Please refer to docs/installation.md for installation instructions.
We support models from the transformers library. Currently, the following models are supported:
- LLaVA-1.5
Extending to other models should be straightforward; see src/models/llava.py for an example.
Please check out save_features.sh, feature_decomposition.sh, and concept_dictionary_evaluation.sh in src/examples for details about the commands used to run the different scripts.
A high-level workflow with this repository consists of three steps:
- Saving hidden states: store a model's hidden states for a particular token of interest ('Dog', 'Cat', 'Train', etc.) from any layer via src/save_features.py. Other options are available for saving hidden states from various parts of the model; please see src/examples/save_features.sh for further details. A minimal extraction sketch is given after this list.
- Multimodal concept extraction: perform dictionary learning on the stored hidden states to obtain your concept dictionary and extract visual/textual grounding information via src/analyse_features.py. See src/examples/feature_decomposition.sh for details. You can visualize the multimodal concepts from your saved results as illustrated in playground/concept_grounding_visualization.ipynb; a projection sketch for test samples also follows this list.
- Evaluation: compute the CLIPScore/Overlap metrics to evaluate your concept dictionary and its use for understanding test samples via src/analyse_features.py. Please refer to src/examples/concept_dictionary_evaluation.sh for details; a simplified CLIPScore sketch is given below.
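For the first step, the sketch below shows one way to pull a token's hidden state out of a transformers LLaVA-1.5 checkpoint. The checkpoint id, prompt format, layer index, and output file are assumptions for illustration only; the repository's actual pipeline is in src/save_features.py and src/models/llava.py.

```python
# Sketch: extract a hidden state for a token of interest from LLaVA-1.5.
# Assumptions: the checkpoint name, prompt, layer index, and output file are
# illustrative only; use src/save_features.py for the repository's pipeline.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"          # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("dog.jpg")                  # hypothetical input image
prompt = "USER: <image>\nDescribe the image briefly. ASSISTANT: A dog"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    model.device, torch.float16
)
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

layer = 20                                     # assumed layer of interest
# Hidden state of the last prompt token ("dog") at the chosen layer.
token_repr = out.hidden_states[layer][0, -1, :].float().cpu()
# Stacking such vectors over many samples yields the (num_samples, hidden_dim)
# matrix decomposed in the dictionary-learning sketch above.
torch.save(token_repr, "dog_token_layer20.pt")
```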
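For the second step, once a concept dictionary has been learned, a test sample can be interpreted by projecting its hidden state onto the dictionary and reading off the most activated concepts. The sketch below uses non-negative least squares as one simple choice of projection; the method, shapes, and file names are assumptions rather than the repository's exact procedure.

```python
# Sketch: project a test sample's hidden state onto a learned concept dictionary.
# Assumptions: U is (num_concepts, hidden_dim) as in the earlier sketch, the
# projection uses non-negative least squares, and file names are hypothetical.
import numpy as np
from scipy.optimize import nnls

U = np.load("concept_dictionary.npy")       # (num_concepts, hidden_dim)
h = np.load("test_hidden_state.npy")        # (hidden_dim,)

# Solve h ~= U.T @ v with v >= 0; v holds the sample's concept activations.
# If U was learned on shifted features, apply the same shift to h first.
v, _ = nnls(U.T, h)

top_concepts = np.argsort(-v)[:3]
print("Most activated concepts for this sample:", top_concepts, v[top_concepts])
```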
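For the evaluation step, a simplified CLIPScore-style similarity between a concept's grounding words and one of its most activating images can be computed with the transformers CLIP model, as sketched below; the checkpoint, inputs, and scoring convention are assumptions and may differ from the metrics implemented in src/analyse_features.py.

```python
# Sketch: simplified CLIPScore-style similarity between a concept's grounding
# words and one of its most activating images. The checkpoint, inputs, and
# rescaling follow Hessel et al.'s CLIPScore only approximately.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("top_activating_sample.jpg")        # hypothetical image
caption = "dog, puppy, pet"                            # hypothetical grounding words

inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )

cos = torch.nn.functional.cosine_similarity(img_emb, txt_emb).item()
clip_score = 2.5 * max(cos, 0.0)                       # CLIPScore-style rescaling
print(f"CLIPScore-like similarity: {clip_score:.3f}")
```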
We welcome contributions to this repository, whether in the form of support for other models, datasets, or other analysis/interpretation methods for multimodal models. Contributions should be made via pull requests; please refer to the guidelines in docs/contributing.md.
If you find this repository useful, please cite the following paper:
```bibtex
@article{parekh2024concept,
  title={A Concept-Based Explainability Framework for Large Multimodal Models},
  author={Parekh, Jayneel and Khayatan, Pegah and Shukor, Mustafa and Newson, Alasdair and Cord, Matthieu},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  year={2024}
}
```