Horizon Estimation

Related Articles from SNS

Adaptive arrival cost update for improving Moving Horizon Estimation performance

arXiv:2606.04163v1 Announce Type: new Abstract: Moving horizon estimation is an efficient technique to estimate states and parameters of constrained dynamical systems. It relies on the solution of a finite horizon optimization problem to compute the estimates, providing a natural framework to handle bounds and constraints on estimates, noises and parameters. However, the approximation of the arrival cost and its updating mechanism are an active research topic.

arXiv CS 6d ago

Think Fast: Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

arXiv:2606.07157v1 Announce Type: new Abstract: Many efforts to ensure frontier AI models are safe rely on monitoring their chain-of-thought (CoT) reasoning. If models become able to perform sufficiently complex reasoning internally, without explicit thinking tokens, this would undermine such oversight. We measure how well frontier models reason without CoT across a suite of over 30,000 questions spanning 43 benchmarks in domains including math, coding, puzzles, causality, theory-of-mind,...

arXiv CS 2d ago

Dual Advantage Fields

arXiv:2606.04188v1 Announce Type: new Abstract: Offline goal-conditioned reinforcement learning requires both long-horizon reachability estimates and local action comparisons. Dual goal representations provide value fields that capture global goal reachability, but they do not directly specify which action should be preferred at a given state. We propose Dual Advantage Fields, a policy-extraction method that turns a bilinear dual value model into a local advantage signal.

arXiv CS 6d ago

SVL: Goal-Conditioned Reinforcement Learning as Survival Learning

arXiv:2604.17551v2 Announce Type: replace Abstract: Standard approaches to goal-conditioned reinforcement learning (GCRL) that rely on temporal-difference learning can be unstable and sample-inefficient due to bootstrapping. While recent work has explored contrastive and supervised formulations to improve stability, we present a probabilistic alternative, called survival value learning (SVL), that reframes GCRL as a survival learning problem by modeling the time-to-goal from each state as a...

arXiv CS 9d ago

Managing hydrogen emissions is key to maximizing climate benefits as hydrogen use expands, say researchers

Lisa Lock, Scientific Editor; Robert Egan, Associate Editor. Current estimates of hydrogen's climate impact are now sufficiently robust to inform policy and business decision-making, according to researchers in a new review article on the climate impacts of hydrogen emissions. Hydrogen is expected to be an important component of future low-carbon energy and industrial systems, particularly...

Phys.org 6d ago

ViVa: A Video-Generative Value Model for Robot Reinforcement Learning

arXiv:2604.08168v2 Announce Type: replace Abstract: Vision-language-action (VLA) models have advanced robot manipulation through large-scale pretraining, but real-world deployment remains challenging due to partial observability and delayed feedback. Reinforcement learning addresses this via value functions, which assess task progress and guide policy improvement. However, existing value models built on vision-language models (VLMs) struggle to capture temporal dynamics and physical...

arXiv CS 2d ago

Learned Response-Field Inertia Operator for HEC-RAS 2D Water-Surface Elevation Prediction

arXiv:2606.06385v1 Announce Type: new Abstract: This article presents a cross-dataset evaluation of learned native-cell surrogate models for solver-consistent water-surface elevation (WSE) prediction in HEC-RAS 2D. To avoid raster remapping error and information-access confounding, surrogates are evaluated directly on the original nonuniform computational cells under an explicit policy that separates static project inputs, current hydraulic state, project-input forcing, calibration-derived...

arXiv CS 5d ago

Feat2Go: Visual Feature-Grounded Value Estimation for Embodied Reinforcement Learning

arXiv:2605.30795v1 Announce Type: new Abstract: Reinforcement learning is a promising approach for improving the capabilities of vision-language-action (VLA) models while avoiding the heavy data requirements of imitation learning. However, its effectiveness for VLA models is often constrained by sparse supervision and the difficulty of designing informative reward signals for long-horizon manipulation. In this work, we present Feat2Go, a fine-grained value estimation framework for embodied...

arXiv CS 9d ago

Data- and Variance-dependent Regret Bounds for Online Tabular MDPs

arXiv:2602.01903v2 Announce Type: replace Abstract: This work studies online episodic tabular Markov decision processes (MDPs) with known transitions and develops best-of-both-worlds algorithms that achieve refined data-dependent regret bounds in the adversarial regime and variance-dependent regret bounds in the stochastic regime. We quantify MDP complexity using a first-order quantity and several new data-dependent measures for the adversarial regime, including a second-order quantity and a...

arXiv CS 7d ago

Planet nine mystery deepens as new discovery challenges hidden planet theory

A hidden giant planet may be lurking beyond Neptune, but every new discovery seems to deepen the mystery. Date: June 8, 2026. Source: The Conversation.

Science Daily 1d ago