Uncovering Model Processing Strategies with Non-Negative Per-Example Fisher Factorization

arXiv CS, Friday 05 June 2026, 04:00 UTC. By Michael Matena and Colin Raffel.

arXiv:2310.04649v3

Abstract: We introduce NPEFF (Non-Negative Per-Example Fisher Factorization), an interpretability method that aims to uncover strategies used by a model to generate its predictions. NPEFF decomposes per-example Fisher matrices using a novel decomposition algorithm that learns a set of components represented by rank-1 positive semi-definite matrices. Through a combination of human evaluation and automated analysis, we demonstrate that these NPEFF components correspond to model processing strategies for a variety of language models and text processing tasks. We further show how to construct parameter perturbations from NPEFF components to selectively disrupt a given component's role in the model's processing. Alongside extensive ablation studies, we include experiments showing how NPEFF can be used to analyze and mitigate collateral effects of unlearning, and we use NPEFF to study in-context learning. Furthermore, we demonstrate the advantages of NPEFF over baselines such as gradient clustering and dictionary learning over model activations with sparse autoencoders. We release the code used in this work.
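To make the idea in the abstract concrete, here is a toy sketch of what factorizing per-example Fisher matrices into non-negatively weighted rank-1 positive semi-definite components could look like. It is not the authors' released code or their decomposition algorithm. The tiny softmax-linear model, the squared-error objective, the projected gradient descent with backtracking, and every name (`per_example_fisher`, `reconstruct`, `fit`, `coef`, `comps`) are assumptions made purely for illustration.

```python
import numpy as np


def per_example_fisher(theta, x, n_classes):
    """Fisher matrix of a toy softmax-linear model for a single input x.

    F = sum_y p(y|x) * g_y g_y^T, where g_y is the gradient of log p(y|x)
    with respect to the flattened weights theta.
    """
    weights = theta.reshape(n_classes, x.size)
    logits = weights @ x
    p = np.exp(logits - logits.max())
    p /= p.sum()
    fisher = np.zeros((theta.size, theta.size))
    for y in range(n_classes):
        onehot = np.zeros(n_classes)
        onehot[y] = 1.0
        # d log p(y|x) / d weights[c, :] = (1[c == y] - p[c]) * x
        g = np.outer(onehot - p, x).ravel()
        fisher += p[y] * np.outer(g, g)
    return fisher


def reconstruct(coef, comps):
    # F_hat[i] = sum_j coef[i, j] * comps[j] comps[j]^T (rank-1 PSD terms)
    return np.einsum("ij,ja,jb->iab", coef, comps, comps)


def loss(fishers, coef, comps):
    return 0.5 * np.sum((fishers - reconstruct(coef, comps)) ** 2)


def fit(fishers, n_components, n_steps=200, seed=0):
    """Illustrative fit: projected gradient descent with backtracking.

    Coefficients are clipped at zero after every step, so each example is a
    non-negative combination of the rank-1 components.
    """
    rng = np.random.default_rng(seed)
    n, d, _ = fishers.shape
    coef = rng.uniform(0.1, 1.0, size=(n, n_components))
    comps = rng.normal(scale=0.1, size=(n_components, d))
    history = [loss(fishers, coef, comps)]
    step = 1.0
    for _ in range(n_steps):
        resid = reconstruct(coef, comps) - fishers
        g_coef = np.einsum("iab,ja,jb->ij", resid, comps, comps)
        # Residuals are symmetric, hence the factor of 2.
        g_comps = 2.0 * np.einsum("ij,iab,jb->ja", coef, resid, comps)
        while step > 1e-12:
            new_coef = np.maximum(coef - step * g_coef, 0.0)
            new_comps = comps - step * g_comps
            new_loss = loss(fishers, new_coef, new_comps)
            if new_loss < history[-1]:
                # Accept only steps that lower the loss, then try a larger one.
                coef, comps = new_coef, new_comps
                history.append(new_loss)
                step *= 2.0
                break
            step *= 0.5
        else:
            break  # no improving step found; stop early
    return coef, comps, history


rng = np.random.default_rng(0)
n_classes, dim, n_examples = 2, 3, 8
theta = rng.normal(size=n_classes * dim)
xs = rng.normal(size=(n_examples, dim))
fishers = np.stack([per_example_fisher(theta, x, n_classes) for x in xs])
coef, comps, history = fit(fishers, n_components=4)
```

In this sketch, row `i` of `coef` says how strongly each component contributes to example `i`, which is the kind of per-example signal one would inspect when asking whether a component corresponds to a processing strategy. A realistic model has far too many parameters to materialize full Fisher matrices like this, so the dense arrays here only serve to show the shape of the problem.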

Originally published by arXiv CS.