Anchor, Trace
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short
arXiv:2606.09380v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has become a leading paradigm for improving the reasoning ability of large language models through outcome-based supervision. However, verifiable rewards frequently become uninformative at the group level: when all sampled traces of a given prompt receive identical rewards, group-relative advantage estimation provides no gradient signal, even though the traces may differ substantially in...
Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States
Announce Type: replace Abstract: Linear probing of large language model (LLM) hidden states is widely used to claim that models learn distinct representations for different reasoning types. We test this by probing Qwen3-14B on three benchmarks spanning the classical trichotomy: LogiQA 2.0 (deductive), ARC-Challenge (inductive), and αNLI (abductive). At layer 32 of 40, linear probes achieve 100% cross-validated accuracy with well-separated geometry (intrinsic dimensionalities: 20.6,...
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models
arXiv:2602.07026v3 Announce Type: replace Abstract: Despite the success of multimodal contrastive learning in aligning visual and linguistic representations, a persistent geometric anomaly, the Modality Gap, remains: embeddings of distinct modalities expressing identical semantics occupy systematically offset regions. Prior approaches to bridge this gap are largely limited by oversimplified isotropic assumptions, hindering their application in large-scale scenarios. In this paper, we address...
Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States
arXiv:2606.02907v1 Announce Type: new Abstract: Linear probing of large language model (LLM) hidden states is widely used to claim that models learn distinct representations for different reasoning types. We test this by probing Qwen3-14B on three benchmarks spanning the classical trichotomy: LogiQA 2.0 (deductive), ARC-Challenge (inductive), and αNLI (abductive). At layer 32 of 40, linear probes achieve 100% cross-validated accuracy with well-separated geometry (intrinsic...
Geometric Latent Reasoning Induces Shorter Generations in LLMs
Abstract: Large language models solve complex problems by generating lengthy chains of explicit reasoning tokens. While effective, this makes reasoning expensive, length-sensitive, and constrained to (discrete) natural language. While latent reasoning offers a continuous alternative, determining useful structures for intermediate latent states is an open challenge.
Full Reverse Engineering of the TI-84 Plus Operating System
TI-84 Plus OS — Reverse-engineering notes: system overview Target: ti84plus.rom (1 MiB flash dump). OS self-identifies as 2.55MP. CPU: Zilog Z80 (16-bit address bus, 64 KiB logical space) with hardware flash/RAM paging.
From Phrygian kings to modern diplomacy: Ankara’s rise as a global cultural crossroads
Ankara is asserting its role as a vital cultural hub following its official designation as the 2026 Tourism Capital of the Turkic world. The city serves as a strategic link between Eastern and Western civilisations, a position highlighted by its historical and archaeological depth. At the Museum of Anatolian Civilisations, Deputy Manager Dr. Umut Alagöz traces this legacy through thousands of years of artefacts, identifying Anatolia as a bridge between Mesopotamia and the Mediterranean.
Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents
arXiv:2605.08717v2 Announce Type: replace Abstract: Software engineering agents are increasingly deployed in evaluable engineering environments, yet post-failure recovery remains costly, manual, and ad hoc. Existing systems expose traces or generate follow-up feedback, but they do not convert heterogeneous runtime evidence into grounded, bounded recovery guidance for a subsequent attempt. We present PROBE, a failure-anchored framework for structured recovery in software engineering agents.
MeerKAT reveals three electron acceleration sites in one solar flare
Sadie Harley, Scientific Editor; Robert Egan, Associate Editor. Solar flares are the most explosive energy-release events in the solar corona, leading to intense particle acceleration, plasma heating and bulk plasma motions on short timescales. Core questions during solar flares remain unresolved, including how and where particle acceleration occurs, and how energized electrons propagate through coronal magnetic structures....
Automating the Expert Eye: A System-Agnostic Deep Learning Framework for Rare Event Discovery in Imbalanced Force Spectroscopy
arXiv:2606.09541v1 Announce Type: new Abstract: Single-Molecule Force Spectroscopy (SMFS) provides unprecedented insights into biomolecular mechanics, yet the high-throughput generation of force-extension trajectories creates a severe data curation bottleneck. Identifying rare molecular unbinding events within thousands of noise-dominated curves traditionally relies on tedious, non-scalable manual auditing. Here, we present a system-agnostic, interpretable deep learning framework tailored to...