Home › Knowledge Base › Reinforcement Learning Based Adaptive Interleaved

Reinforcement Learning Based Adaptive Interleaved

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

AdaptR1: Reinforcement Learning Based Adaptive Interleaved Thinking in Multi-hop Question Answering

Announce Type: new Abstract: Large Language Models (LLMs) have achieved remarkable performance in complex reasoning tasks through Chain-of-Thought (CoT) prompting. However, this approach often leads to ``over-thinking,'' where models generate unnecessarily long reasoning traces for simple queries and incur avoidable inference cost. While recent work has explored adaptive reasoning, existing methods typically make a single query-level decision about whether to reason.

arXiv CS 9d ago

Adaptive Information Control for Search-Augmented LLM Reasoning

Announce Type: replace Abstract: Search-augmented reasoning agents interleave multi-step reasoning with external retrieval, but uncontrolled retrieval can introduce redundant evidence, saturate the context, and destabilize reinforcement learning (RL). Existing outcome-based RL methods provide only sparse terminal rewards, offering limited guidance for intermediate information-acquisition decisions. We propose DeepControl, an adaptive information-control framework based on information...

arXiv CS 6d ago

AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning

arXiv:2512.13278v2 Announce Type: replace Abstract: Agentic reinforcement learning has advanced large language models (LLMs) to reason through long chain-of-thought trajectories while interleaving external tool use. Existing approaches assume a fixed inventory of tools, which limits the adaptability of LLM agents to new or evolving toolsets. We present AutoTool, a training framework that equips LLM agents with dynamic tool-selection capabilities throughout their reasoning trajectories.

arXiv CS 2d ago