Signal for Reasoning Dynamics

Related Articles from SNS

The Tell-Tale Norm: $\ell_2$ Magnitude as a Signal for Reasoning Dynamics in Large Language Models

arXiv:2606.06188v1 Announce Type: new Abstract: Recent work has sought to understand reasoning in Large Language Models (LLMs), yet a principled, model-intrinsic signal that captures its layer-wise reasoning dynamics remains underexplored. We bridge this gap by demonstrating that the $\ell_2$ norm of hidden states serves as an endogenous signal of the model's reasoning intensity. Using Sparse Autoencoders (SAEs) as a diagnostic probe, we observe that LLMs' internal reasoning is marked by a sharp...

arXiv CS 5d ago
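
To make the quantity in the abstract above concrete, here is a minimal sketch of a layer-wise $\ell_2$ norm profile. It is not the paper's code: the function name `layerwise_l2_norms`, the nested-list input layout, and the choice to average the norm over tokens are all assumptions made for illustration.

```python
import math

def layerwise_l2_norms(hidden_states):
    """Return the mean per-token l2 norm for each layer.

    hidden_states: list over layers; each layer is a list of token
    vectors (lists of floats). This layout is a hypothetical stand-in
    for real model activations.
    """
    norms = []
    for layer in hidden_states:
        # l2 norm of each token's hidden vector in this layer
        token_norms = [math.sqrt(sum(x * x for x in vec)) for vec in layer]
        # Averaging over tokens is an assumption, not taken from the abstract
        norms.append(sum(token_norms) / len(token_norms))
    return norms

# Toy activations: 2 layers, 2 tokens, hidden size 2
toy = [
    [[3.0, 4.0], [0.0, 0.0]],  # token norms 5 and 0
    [[6.0, 8.0], [3.0, 4.0]],  # token norms 10 and 5
]
print(layerwise_l2_norms(toy))  # [2.5, 7.5]
```

With real activations, the same per-layer profile could be plotted against layer index to look for the kind of sharp change the abstract describes.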

Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization

Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has emerged as a powerful paradigm for eliciting long-chain reasoning in large language models. However, existing methods based on Group Relative Policy Optimization (GRPO) rely on a binary outcome reward, which induces two structural failure modes: Zero-Advantage Collapse, in which all rollouts in a group share the same outcome and the gradient vanishes, and Hallucinated Certainty, in which the model becomes...

arXiv CS 1d ago
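
The Zero-Advantage Collapse mentioned in the abstract above can be illustrated with the commonly used group-relative advantage, in which each reward is normalized by the group mean and standard deviation. This is a sketch of that standard GRPO-style normalization, not of the paper's proposed method; the function name, the `eps` stabilizer, and the use of the population standard deviation are assumptions.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one rollout group: (r - mean) / (std + eps).

    eps and the population standard deviation are illustrative choices;
    implementations differ on both.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Every rollout in the group succeeds: all advantages are zero,
# so the policy gradient contribution from this group vanishes.
collapsed = group_relative_advantages([1.0, 1.0, 1.0, 1.0])
print(collapsed)

# Mixed outcomes: successes get positive and failures negative advantages.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(mixed)
```

The same collapse occurs when every rollout fails, since identical rewards of any value give a zero numerator for each sample.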

Unveiling the Entropy Dynamics of Chain-of-Thought Reasoning

arXiv:2606.02020v1 Announce Type: new Abstract: This paper investigates the entropy dynamics of Chain-of-Thought (CoT) and uncovers a consistent two-phase structure: an Uncertainty Region of exploration transitioning sharply to a Confidence Region of convergence. We demonstrate that the Confidence Region possesses two critical properties: 1) High Reliability -- answers in the confidence region become highly accurate and stable, and 2) High Redundancy -- models generate unnecessary tokens...

arXiv CS 8d ago

FinTradeBench: A Financial Reasoning Benchmark for LLMs

Announce Type: replace Abstract: Real-world financial decision-making is a challenging problem that requires reasoning over heterogeneous signals, including company fundamentals derived from regulatory filings and trading signals computed from price dynamics. Recently, with advances in Large Language Models (LLMs), financial analysts have begun to use them for financial decision-making tasks. However, existing financial question-answering benchmarks for testing these models primarily focus...

arXiv CS 6d ago

IDOL: Inverse-Dynamics-Guided Future Prediction for End-to-End Autonomous Driving

arXiv:2605.31476v1 Announce Type: new Abstract: End-to-end autonomous driving has emerged as a compelling paradigm for learning planning directly from sensor observations, while recent world-model-based approaches further enrich this paradigm by enabling explicit reasoning about how the scene may evolve in the future. Yet future prediction alone does not guarantee better planning unless the predicted evolution can be converted into planning-relevant trajectory updates. Many current methods...

arXiv CS 9d ago

ForecastCompass: Guiding Agentic Forecasting with Adaptive Factor Memory

Announce Type: new Abstract: Agentic forecasting is important for decision-making in dynamic environments, but it remains challenging because agents must reason from incomplete, time-limited evidence and produce calibrated probabilities before outcomes are resolved. Memory provides a natural mechanism for transferring experience from resolved forecasts to future prediction tasks. However, existing agent-memory methods are not tailored to forecasting, as they typically store past...

arXiv CS 9d ago

Active Exploring like a Pigeon: Reinforcing Spatial Reasoning via Agentic Vision-Language Models

Announce Type: new Abstract: Enabling Vision-Language Models (VLMs) to perform spatial reasoning remains challenging. Existing approaches treat VLMs as passive observers, which is difficult for real-world applications. Moreover, reinforcement learning methods rely on sparse rewards, limiting their effectiveness for complex reasoning tasks.

arXiv CS 8d ago

Limits of Spatial Imagery Reasoning in Frontier LLM Models

Announce Type: replace Abstract: Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, yet they struggle with spatial tasks that require mental simulation, such as mental rotation. This paper investigates whether equipping an LLM with an external "Imagery Module" -- a tool capable of rendering and rotating 3D models -- can bridge this gap, functioning as a "cognitive prosthetic." We conducted experiments using a dual-module architecture in which a reasoning...

arXiv CS 8d ago

Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short

arXiv:2606.09380v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has become a leading paradigm for improving the reasoning ability of large language models through outcome-based supervision. However, verifiable rewards frequently become uninformative at the group level: when all sampled traces of a given prompt receive identical rewards, group-relative advantage estimation provides no gradient signal, even though the traces may differ substantially in...

arXiv CS 1d ago

Adaptive Generate-Rank-Verify: Inference-Time Search with Costly Verification

Announce Type: replace Abstract: Many inference-time language-model pipelines combine a cheap reward signal with an expensive verifier, such as exact answer checking in mathematical reasoning or hidden-test execution in code generation. We formalize this setting using a learning-theoretic lens as generative active search: a cost-sensitive first-positive search problem in which a policy adaptively samples candidates from an unknown distribution, observes cheap scores, and pays for verifier...

arXiv CS 1d ago