
Introduction

The CLIP-Dissect Automatic Evaluation framework is designed to deepen our understanding of deep neural networks by evaluating the concept labels assigned to individual neurons. CLIP-Dissect leverages multimodal vision/language models to automatically label neurons with concepts, shedding light on how networks process information; our framework assesses how well those labels actually describe neuron behavior.

In this study, we address the challenge of evaluating the interpretability of neurons in deep neural networks, focusing on CLIP-Dissect, a method that uses multimodal vision/language models for automatic neuron labeling. Because hidden neurons have no ground-truth labels and are therefore difficult to assess quantitatively, we propose an automated, qualitative evaluation approach built on three distinct methodologies: BLIP-2 prompting and two OpenCLIP-based methods, one measuring concept proportion and the other embedding similarity.

Methodology

Our methodology employs three distinct approaches to evaluate neuron labels: BLIP-2 prompting, OpenCLIP with concept proportion, and OpenCLIP with embedding similarity. Each method offers a different perspective on neuron functionality, using a neuron's top activating images to assess how relevant its CLIP-Dissect label is. We apply these tools to CLIP-Dissect's results, evaluating each neuron label against its associated activating images.
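
To make the first approach concrete, the snippet below sketches a BLIP-2 prompting check: for each of a neuron's top activating images, we ask BLIP-2 a yes/no question about the CLIP-Dissect label and score the label by the fraction of "yes" answers. This is a minimal sketch assuming the Salesforce/blip2-opt-2.7b checkpoint from Hugging Face Transformers; the prompt wording, checkpoint, and aggregation shown here are illustrative assumptions rather than the framework's exact settings.

```python
# Minimal sketch of the BLIP-2 prompting check (illustrative settings only).
# The checkpoint name, prompt wording, and yes/no parsing are assumptions.
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b").to(device)

def label_matches_image(image_path: str, label: str) -> bool:
    """Ask BLIP-2 a yes/no question about whether the label fits the image."""
    image = Image.open(image_path).convert("RGB")
    prompt = f"Question: Does this image contain {label}? Answer:"
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
    out = model.generate(**inputs, max_new_tokens=5)
    answer = processor.decode(out[0], skip_special_tokens=True).strip().lower()
    return answer.startswith("yes")

# Score a label by the fraction of the neuron's top activating images
# for which BLIP-2 answers "yes" (hypothetical paths and label below).
top_images = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]
score = sum(label_matches_image(p, "striped pattern") for p in top_images) / len(top_images)
print(f"BLIP-2 agreement: {score:.2f}")
```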

Methodology Overview
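
The two OpenCLIP-based checks can be sketched in a similar spirit by embedding the top activating images and candidate concepts in CLIP's shared image-text space. The snippet below is a minimal sketch assuming a ViT-B-32 model with LAION-2B weights and a small hypothetical distractor set; it reads "embedding similarity" as the mean image-label cosine similarity and "concept proportion" as the fraction of images whose nearest concept is the label, which is one plausible reading of the two metrics rather than their exact definitions.

```python
# Minimal sketch of the two OpenCLIP-based checks (illustrative settings only).
# The model/pretraining tag, distractor concepts, and exact scoring are assumptions.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def embed_images(paths):
    imgs = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    with torch.no_grad():
        feats = model.encode_image(imgs)
    return feats / feats.norm(dim=-1, keepdim=True)

def embed_texts(texts):
    with torch.no_grad():
        feats = model.encode_text(tokenizer(texts))
    return feats / feats.norm(dim=-1, keepdim=True)

top_images = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]   # hypothetical paths
label = "striped pattern"                              # CLIP-Dissect label under test
distractors = ["dog", "skyscraper", "ocean"]           # hypothetical other concepts

img_feats = embed_images(top_images)
txt_feats = embed_texts([label] + distractors)

# Embedding similarity: mean cosine similarity between the label embedding
# and the embeddings of the neuron's top activating images.
embedding_similarity = (img_feats @ txt_feats[0]).mean().item()

# Concept proportion: fraction of top activating images whose nearest concept
# (among label + distractors) is the CLIP-Dissect label itself.
nearest = (img_feats @ txt_feats.T).argmax(dim=-1)
concept_proportion = (nearest == 0).float().mean().item()

print(f"embedding similarity: {embedding_similarity:.3f}, "
      f"concept proportion: {concept_proportion:.2f}")
```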

Key Findings

Quantitative Analysis

Time Analysis

Similarity Analysis

Discussion

The disparity between automated methods and human judgments highlights how difficult it is to achieve interpretability in neural networks. Our framework's adaptability allows newer vision/language models to be swapped in, so the evaluation process can keep pace with the state of the art.

Conclusion

The CLIP-Dissect Automatic Evaluation framework provides a structured, automated means of evaluating neuron labels, helping researchers understand and improve neural network interpretability.