Home › Business & Finance › Ablation-Reversible Heads Don't Transfer: A Stress Test...

Business & Finance

Ablation-Reversible Heads Don't Transfer: A Stress Test for Mechanistic Role Claims in Transformers

arXiv CS Tuesday 09 June 2026, 04:00 UTC By Philip Quirke 1 min read

Key Points

Announce Type: new Abstract: In mechanistic interpretability, attention heads are commonly elevated to role claims (e.g., "this head represents addition") when they are necessary for a behavior, encode it linearly, and recover that behavior when restored after ablation. We show this evidence is insufficient: across three 7-8B instruction-tuned models and five computation families, heads passing all three checks routinely fail to transfer the computation when their activations are patched...

arXiv:2606.08292v1 Announce Type: new Abstract: In mechanistic interpretability, attention heads are commonly elevated to role claims (e.g., "this head represents addition") when they are necessary for a behavior, encode it linearly, and recover that behavior when restored after ablation. We show this evidence is insufficient: across three 7-8B instruction-tuned models and five computation families, heads passing all three checks routinely fail to transfer the computation when their activations are patched into a different prompt under matched controls. We introduce KID (Knowing / Intent / Doing), a role-assignment lens for attention heads, and pair it with a three-stage pipeline: capability-selective screening (CSS), singular value decomposition (SVD), and activation transduction under matched controls. Our results document a preliminary role taxonomy (including prompt-trajectory stabilizers, answer-side logit-bias heads, and soft computation-pattern carriers) and show that the same-answer control (a transduction target sharing the answer string but not the requested computation) is an underused check that exposes broad state transfer masquerading as semantic specificity.

CSS (ORG) SVD (ORG)

Originally published by arXiv CS Read original →

Ablation-Reversible Heads Don't Transfer: A Stress Test for Mechanistic Role Claims in Transformers

Related Stories

What all 30 MLB teams must do before the trade dea...

The World Cup will likely be the biggest gambling event in history

Social Security could face an automatic 22% cut in 2032. These 4 moves will protect your retirement.

Labour urged to ‘call time’ on the triple lock to cut welfare spending