
AtMan - XAI on generative models

AtMan is an explainability method designed for multi-modal generative transformer models. It correlates the relevance of the input tokens to the generated output by exhaustive perturbation. To obtain the relevance scores, it applies ATtention MANipulation throughout all layers and measures the difference in the resulting logprobs on the target tokens. It further incorporates embedding similarity to suppress correlated tokens at once. As depicted in the following examples, one is able to highlight various discriminative features on the same input, in particular on the text as well as the image modality.
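The perturbation loop described above can be sketched on a toy model. This is only an illustrative sketch, not the repository's implementation: the single attention layer, the random weights, and the suppression factor are all assumptions made for the example. The core idea it demonstrates is real, though: suppress the attention paid to one input token, re-score the target, and take the drop in log-probability as that token's relevance.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def target_logprob(tokens, suppress_idx=None, factor=0.9):
    """Toy single-attention-layer 'model': log-prob of a fixed target class.

    Suppressing a token scales down the attention paid *to* it (one simple
    variant of attention manipulation) and renormalizes the attention weights.
    """
    rng = np.random.default_rng(0)               # fixed weights for determinism
    d = tokens.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    Wout = rng.standard_normal(d)
    scores = (tokens @ Wq) @ (tokens @ Wk).T / np.sqrt(d)
    attn = softmax(scores, axis=-1)
    if suppress_idx is not None:
        attn[:, suppress_idx] *= factor          # attenuate attention to token
        attn /= attn.sum(axis=-1, keepdims=True)
    pooled = (attn @ (tokens @ Wv)).mean(axis=0)
    logits = np.array([pooled @ Wout, 0.0])      # target class vs. rest
    return np.log(softmax(logits))[0]

def atman_relevance(tokens):
    """Relevance of each input token = drop in target log-prob when suppressed."""
    base = target_logprob(tokens)
    return np.array([base - target_logprob(tokens, i)
                     for i in range(len(tokens))])
```

In the actual method this perturbation runs through all transformer layers of the real model, and embedding similarity extends the suppression from a single token to all tokens correlated with it.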

(Figure: Demonstrating AtMan XAI on model generation)

Paper: https://openreview.net/forum?id=PBpEb86bj7

roadmap

  • continue to clean up the repo
  • in particular, remove the Explainer class and other overhead
  • more examples
  • Hugging Face integration?

prelim

This repo includes the XAI methods AtMan and Chefer, as well as a Captum interface for IG, GradCam, etc., applied to the language model GPT-J and the vision-language models MAGMA and BLIP. (Big props to Mayukh Deb.)

To install all required dependencies, run the following command, e.g. in a conda environment with Python 3.8:

bash startup-hook.sh

Note: further model checkpoints will be downloaded when executing for the first time. Sometimes CLIP fails to verify on the first execution; running the script again usually works.

The main folders are atman-magma, containing all XAI implementations on the MAGMA model, and BLIP, containing all XAI implementations on the BLIP model.

examples with MAGMA

cd atman-magma

image-text/ MAGMA x AtMan

requires 1 RTX 3090

python example_explain_panda_atman.py

image-text/ MAGMA x Chefer

requires 1 A100

python example_explain_panda_chefer.py

image-text/ MAGMA x Captum IxG, ...

requires 1 A100

python example_explain_panda_captum.py

image-text/ rollout

requires 1 RTX 3090

python example_explain_attention_rollout.py

text/ GPT-J

python example_steering.py
python example_document_qa_sentence_level_explain.py

examples with BLIP

cd BLIP

image-text/ BLIP x AtMan

python explain_vqa_run.py

image-text/ BLIP x Chefer

python explain_vqa_chefer.py

Method and Evaluation

(Figures illustrating the method and its evaluation: steering and measuring; embedding similarity and squash; VQA; performance; quantitative results)

cite

@inproceedings{
deiseroth2023atman,
title={{ATMAN}: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation},
author={Bj{\"o}rn Deiseroth and Mayukh Deb and Samuel Weinbach and Manuel Brack and Patrick Schramowski and Kristian Kersting},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=PBpEb86bj7}
}
