Adaptive Dynamic Programming

Related Articles from SNS

A Barrier-Modulated Architecture for Safe Affine Formation Control in Second-Order Multi-Agent Systems

arXiv:2606.08137v1 Announce Type: new Abstract: Affine formation control offers immense flexibility for coordinating multi-agent maneuvers, but guaranteeing the safety of agents under parametric uncertainties remains an open challenge. This paper proposes a novel safe affine formation control framework for second-order multi-agent systems by integrating Higher-Order Control Barrier Functions (HOCBFs) with Adaptive Dynamic Programming (ADP). We introduce a barrier-modulated control...

arXiv CS 1d ago

Are Large Reasoning Models Interruptible?

Announce Type: replace Abstract: Real-world applications of Large Reasoning Models (LRMs) often require reasoning about changing prompts or environments. In this work, we challenge the frozen world assumption and evaluate LRM robustness under two realistic dynamic scenarios: interruptions, which test the accuracy of model responses under budget-constrained outputs, and dynamic context, which tests model adaptation to in-flight changes. Across mathematics and programming benchmarks that...

arXiv CS 8d ago

Agent Operating Systems (AOS): Integrating Agentic Control Planes into, and Beyond, Traditional Operating Systems

arXiv:2606.01508v1 Announce Type: new Abstract: Traditional operating systems were designed around deterministic programs, explicit control flow, and human-initiated workflows. Their core abstractions (processes, threads, system calls, files, and permissions) assume bounded behavior and predictable interaction patterns. Agentic AI systems introduce a different execution model: long-lived, goal-directed entities that reason probabilistically, invoke tools dynamically, and adapt behavior based...

arXiv CS 8d ago

Safety-Critical Adaptive Impedance Control via Nonsmooth Control Barrier Functions under State and Input Constraints

arXiv:2605.28367v5 Announce Type: replace Abstract: Safe physical interaction is critical for deploying robotic manipulators in human-robot interaction and contact-rich tasks, where uncertainty, external forces, and actuator limitations can compromise both performance and safety. We propose an online adaptive impedance control framework that enforces joint-state safety while achieving compliant interaction under uncertain dynamics. The approach combines a quadratic-program-based safety...

arXiv CS 5d ago

Adaptive Sensing beyond Non-Adaptive Information Limits: End-to-End Co-Design of Geometry, Policy, and Inference

arXiv:2604.25193v2 Announce Type: replace Abstract: Inverse design has transformed vast physical parameter spaces into a substrate for emergent functionality, raising the tantalizing prospect of relocating intelligence from the digital domain into the physical world itself. Nowhere is this prospect more consequential than in sensing, where the analog-to-digital interface imposes a fundamental bottleneck: information not captured by the hardware is irrevocably lost to any downstream...

arXiv Physics 1d ago

A thalamus–brainstem attractor network drives history-biased decisions

Abstract: Natural environments often change gradually, making it adaptive to bias decisions on the basis of the recent past, a phenomenon known as serial dependence [1-3]. Large-scale recordings during behaviour have identified that serial dependence is a common motif for decision-making, with neural representations of past experiences found throughout the brain [4-11]. However, it remains unclear whether this bias arises from dedicated neural circuits with history-specific...

Nature 18h ago

KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem

arXiv:2602.20217v2 Announce Type: replace Abstract: Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on static heuristics that ignore the dynamic computational overhead of attention in long-context scenarios. We propose KnapSpec, a training-free framework that reformulates draft model selection as a knapsack problem to maximize tokens-per-time throughput. By decoupling Attention and MLP layers and...

arXiv CS 7d ago

Adaptive Generate-Rank-Verify: Inference-Time Search with Costly Verification

Announce Type: replace Abstract: Many inference-time language-model pipelines combine a cheap reward signal with an expensive verifier, such as exact answer checking in mathematical reasoning or hidden-test execution in code generation. We formalize this setting using a learning-theoretic lens as generative active search: a cost-sensitive first-positive search problem in which a policy adaptively samples candidates from an unknown distribution, observes cheap scores, and pays for verifier...

arXiv CS 1d ago