Home Knowledge Base Ablations

Ablations

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

AblationBench: Evaluating Automated Planning of Ablations in Empirical AI Research

Announce Type: replace Abstract: Language model agents are increasingly used to automate scientific research, yet evaluating their scientific contributions remains a challenge. A key mechanism to obtain such insights is through ablation experiments. To this end, we introduce AblationBench, a benchmark suite for evaluating agents on ablation planning tasks in empirical AI research.

arXiv CS 8d ago

Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes

arXiv:2606.09607v1 Announce Type: new Abstract: Interpretability increasingly treats groups of components, not individual units, as the basic object, and proposes to find them by clustering co-activation statistics. We ask whether such a cheap signal actually identifies an attention-head circuit. Adapting a sparse-autoencoder clustering recipe to attention heads -- but validating by causal ablation rather than reconstruction -- we cluster heads and then run a closure test: ablate the...

arXiv CS 1d ago

Component Ablation for Efficient Hybrid Language Model Architectures: Performance, Resilience, and Compression Implications

arXiv:2603.22473v2 Announce Type: replace Abstract: Hybrid language models combine softmax attention with linear-time sequence mechanisms such as state-space or linear-attention layers, but the functional contribution of each component type remains insufficiently characterized. We study component-level ablation in two sub-1B hybrid language models, Qwen3.5-0.8B and Falcon-H1-0.5B, using likelihood-based evaluation, downstream benchmarks, layer-wise interventions, random controls, and...

arXiv CS 1d ago

Ablation Study of Block Size, Weight Precision, and Scale Precision in NVFP4 Inference for Low-Power Edge-Efficient Neural Networks

Announce Type: new Abstract: Energy-efficient edge inference requires reducing arithmetic cost, memory traffic, and hardware overhead. This paper presents an ablation-focused study of NVFP4 LUT-based inference for edge-efficient neural networks. The proposed NVLUT framework combines 4-bit NVFP4 activations, two-level scaling, LUT-based mantissa computation, voltage-scaled storage, and selective ECC protection.

arXiv CS 2d ago

Ablation-Reversible Heads Don't Transfer: A Stress Test for Mechanistic Role Claims in Transformers

Announce Type: new Abstract: In mechanistic interpretability, attention heads are commonly elevated to role claims (e.g., "this head represents addition") when they are necessary for a behavior, encode it linearly, and recover that behavior when restored after ablation. We show this evidence is insufficient: across three 7-8B instruction-tuned models and five computation families, heads passing all three checks routinely fail to transfer the computation when their activations are patched...

arXiv CS 1d ago

AI Scientists Are Only as Good as Their Evidence: A Stratified Ablation of Proprietary Data and Reasoning Skills in Drug-Asset Valuation

arXiv:2606.09556v1 Announce Type: new Abstract: AI Scientist agents are often evaluated as if capability were mainly a function of model quality, prompting, or reasoning scaffolds. We test a different hypothesis in drug-asset valuation: for knowledge-intensive scientific decisions, the limiting factor is often the evidence substrate the agent can access. We run a controlled three-arm ablation on a production valuation agent: A is a plain web-only LLM analyst, B adds public structured tools...

arXiv CS 1d ago

Measuring, Localizing, and Ablating Alignment Signatures in LLMs

arXiv:2605.30526v1 Announce Type: new Abstract: Aligned language models often exhibit a recognizable AI-like style, yet its connection to post-training and internal representations remains poorly understood. In this work, we study whether post-training introduces or amplifies AI-like stylistic regularities and whether these regularities have a localized internal signature. To this end, we compare human text, base-model generations, and aligned-model generations under matched human-source...

arXiv CS 9d ago

Neuron-Anchored Rule Extraction for Large Language Models via Contrastive Hierarchical Ablation

arXiv:2605.03058v2 Announce Type: replace Abstract: A central goal of explainable AI is to express large language model (LLM) decision logic symbolically and ground it in internal mechanisms. Existing rule-extraction methods usually learn ungrounded symbolic surrogates, while mechanistic interpretability links behavior to neurons but often requires hand-crafted hypotheses and costly interventions. We introduce MechaRule, a pipeline that grounds rule extraction in LLM circuits by localizing...

arXiv CS 1d ago

Type-Error Ablation and AI Coding Agents

Announce Type: new Abstract: Programming language implementors have designed error messages with one consumer in mind: the human programmer. Human-factors research has consistently found that programmers engage with error messages poorly -- they skim, miss key information, and are easily overwhelmed. The practical consequence has been a strong design pressure toward brevity: messages should be terse enough that programmers will actually read them.

arXiv CS 8d ago

Ablating Archetypes: The Stability of Archetypal SAEs is an Artifact of Initialization and Metric Design

arXiv:2606.02061v1 Announce Type: new Abstract: Dictionary learning with sparse autoencoders (SAEs) produces overcomplete bases from neural network activations that are often interpretable and reduces polysemanticity. However, features from SAEs vary substantially across random seeds -- a problem known as instability. Archetypal SAEs (Fel et al., 2025) were proposed as a general dictionary-learning intervention for more reliable concept extraction, and report more stable dictionaries at the...

arXiv CS 8d ago