AtMan - Making LLMs Trustworthy with Attention Manipulation

What is it? AtMan is an attempt to help users identify the source of an answer or completion produced by an LLM (Large Language Model). Unlike existing methods, AtMan is memory-efficient and multimodal (i.e., it works on both images and text). Existing methods for explaining the outputs of neural networks generally fall into one of two types: Gradient-based methods: Require a backward pass, which takes almost double the memory of a forward pass....

January 30, 2023 · Mayukh Deb