
AtMan - XAI on generative models

AtMan is an explainability method designed for multi-modal generative transformer models. It correlates the relevance of the input tokens to the generated output by exhaustive perturbation. To obtain the relevance scores, it applies ATtention MANipulation throughout all layers and measures the difference in the resulting logprobs on the target tokens. It further incorporates embedding similarity to suppress correlated tokens at once. As depicted in the following examples, one is able to highlight various discriminative features on the same input, in particular on the text as well as the image modality.
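The perturbation loop described above can be sketched on a toy model. This is only an illustrative sketch, not the repository's implementation: the single attention layer, the random weights, and the suppression factor are all assumptions made for the example. The core idea it demonstrates is real, though: suppress the attention paid to one input token, re-score the target, and take the drop in log-probability as that token's relevance.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def target_logprob(tokens, suppress_idx=None, factor=0.9):
    """Toy single-attention-layer 'model': log-prob of a fixed target class.

    Suppressing a token scales down the attention paid *to* it (one simple
    variant of attention manipulation) and renormalizes the attention weights.
    """
    rng = np.random.default_rng(0)               # fixed weights for determinism
    d = tokens.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    Wout = rng.standard_normal(d)
    scores = (tokens @ Wq) @ (tokens @ Wk).T / np.sqrt(d)
    attn = softmax(scores, axis=-1)
    if suppress_idx is not None:
        attn[:, suppress_idx] *= factor          # attenuate attention to token
        attn /= attn.sum(axis=-1, keepdims=True)
    pooled = (attn @ (tokens @ Wv)).mean(axis=0)
    logits = np.array([pooled @ Wout, 0.0])      # target class vs. rest
    return np.log(softmax(logits))[0]

def atman_relevance(tokens):
    """Relevance of each input token = drop in target log-prob when suppressed."""
    base = target_logprob(tokens)
    return np.array([base - target_logprob(tokens, i)
                     for i in range(len(tokens))])
```

In the actual method this perturbation runs through all transformer layers of the real model, and embedding similarity extends the suppression from a single token to all tokens correlated with it.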

(Figure: Demonstrating AtMan XAI on model generation)

Paper: https://openreview.net/forum?id=PBpEb86bj7

roadmap

  • continue to clean up the repo
  • in particular, remove the Explainer class and other overhead
  • more examples
  • Hugging Face integration?

prelim

This repo includes the XAI methods AtMan and Chefer, as well as a Captum interface for IG, GradCam, etc., applied to the language model GPT-J and the vision-language models MAGMA and BLIP. (Big props to Mayukh Deb.)

To install all required dependencies, run the following command, e.g. in a conda environment with Python 3.8:

bash startup-hook.sh

Note: further model checkpoints will be downloaded when executing for the first time. Sometimes CLIP fails to verify on the first execution; running the script again usually works.

The main folders are atman-magma, containing all XAI implementations on the MAGMA model, and BLIP, containing all XAI implementations on the BLIP model.

examples with MAGMA

cd atman-magma

image-text/ MAGMA x AtMan

requires 1 RTX 3090

python example_explain_panda_atman.py

image-text/ MAGMA x Chefer

requires 1 A100

python example_explain_panda_chefer.py

image-text/ MAGMA x Captum IxG, ...

requires 1 A100

python example_explain_panda_captum.py

image-text/ rollout

requires 1 RTX 3090

python example_explain_attention_rollout.py

text/ GPT-J

python example_steering.py
python example_document_qa_sentence_level_explain.py

examples with BLIP

cd BLIP

image-text/ BLIP x AtMan

python explain_vqa_run.py

image-text/ BLIP x Chefer

python explain_vqa_chefer.py

Method and Evaluation

(Figures illustrating the method and its evaluation: steering and measuring; embedding similarity and squash; VQA; performance; quantitative results)

cite

@inproceedings{
deiseroth2023atman,
title={{ATMAN}: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation},
author={Bj{\"o}rn Deiseroth and Mayukh Deb and Samuel Weinbach and Manuel Brack and Patrick Schramowski and Kristian Kersting},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=PBpEb86bj7}
}
