Explainability of Features Learned by Diffusion Models
Supervisor
Suitable for
Abstract
Large-scale image diffusion models (such as Stable Diffusion) have shown remarkable capabilities in generating and editing images and have transformed the field of generative image modelling. Their learned features have been shown to be very general and can thus be used for many downstream tasks such as segmentation and semantic correspondence. However, much less research has been done on understanding these features.
In this project, we will investigate the explainability of these features using established explainability frameworks and techniques. Because these features come from a generative model, they may encode quite different information from the features of prior discriminative models.
Goals:
• Extract features from large diffusion models (a minimal extraction sketch follows the stretch goal below).
• Set up explanation methods to utilise these features.
• Find similarities and differences between these features and those of other models.
Stretch Goal:
• There are many ways to extract features from diffusion models. Based on the results of the previous goals, is there a principled way to determine which one is best?
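As a concrete starting point for the first two goals, the sketch below shows one way to pull intermediate U-Net activations out of Stable Diffusion and hand them to a simple probing-style readout. It is a minimal sketch, not a prescribed pipeline: the Hugging Face diffusers library, the "runwayml/stable-diffusion-v1-5" checkpoint, the choice of the U-Net mid-block as the feature layer, the noise timestep, and the pooled linear-probe readout are all illustrative assumptions.

```python
# Minimal sketch (diffusers + PyTorch); checkpoint, feature layer, and
# timestep are illustrative assumptions, not the project's prescribed setup.
import numpy as np
import torch
from diffusers import StableDiffusionPipeline
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
).to(device)

features = {}

def save_activation(module, inputs, output):
    # Keep the activation of the hooked U-Net block for later analysis.
    features["mid"] = output.detach().cpu()

# Hook the mid-block; down-/up-blocks are equally valid extraction points.
hook = pipe.unet.mid_block.register_forward_hook(save_activation)

@torch.no_grad()
def extract_features(image: Image.Image, timestep: int = 100) -> torch.Tensor:
    """Encode an image, add noise at `timestep`, run one U-Net pass,
    and return the hooked activation map."""
    img = image.convert("RGB").resize((512, 512))
    x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0
    x = x.permute(2, 0, 1).unsqueeze(0).to(device)

    latents = pipe.vae.encode(x).latent_dist.mean * pipe.vae.config.scaling_factor
    t = torch.tensor([timestep], device=device)
    noisy = pipe.scheduler.add_noise(latents, torch.randn_like(latents), t)

    # Unconditional (empty-prompt) text embedding as conditioning.
    tokens = pipe.tokenizer(
        "", padding="max_length",
        max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
    ).input_ids.to(device)
    cond = pipe.text_encoder(tokens)[0]

    pipe.unet(noisy, t, encoder_hidden_states=cond)
    return features["mid"]

# Usage: pool the spatial map and probe it, e.g. with a linear classifier,
# to measure which concepts are decodable from the features.
# feats = extract_features(Image.open("example.jpg"))   # [1, C, H', W']
# pooled = feats.mean(dim=(2, 3))                       # [1, C]
# hook.remove()  # detach the hook when done
```

Hooking a single block keeps the sketch short; in practice one would compare several blocks and noise timesteps, which is exactly the question the stretch goal raises.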
References:
Rombach, Robin, et al. "High-Resolution Image Synthesis with Latent Diffusion Models." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022.
Zhang, Junyi, et al. "A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence." arXiv preprint arXiv:2305.15347 (2023).
Laina, Iro, Yuki M. Asano, and Andrea Vedaldi. "Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing." International Conference on Learning Representations. 2022.
Pre-requisites: Machine Learning