Value Flows
Related Articles from SNS
Value Flows
arXiv:2510.07650v4 Announce Type: replace Abstract: While most reinforcement learning methods today flatten the distribution of future returns to a single scalar value, distributional RL methods exploit the return distribution to provide stronger learning signals and to enable applications in exploration and safe RL. While the predominant method for estimating the return distribution is by modeling it as a categorical distribution over discrete bins or estimating a finite number of...
Q-VGM: Q-Guided Value-Gradient Matching for Flow-Matching VLA Policies
Announce Type: new Abstract: We propose Q-Guided Value-Gradient Matching (Q-VGM), an off-policy reinforcement learning (RL) method that tackles a long-standing challenge in fine-tuning flow-matching vision-language-action (VLA) policies: efficiently improving an expressive flow-matching action expert with respect to a learned Q-function. Effective improvement must exploit the first-order (gradient) information of the critic, but this is difficult for flow policies, because directly...
Deep learning four decades of human migration
Abstract: Human migration is a fundamental driver of global demographic change, shaping population structure, labour markets and social policy across countries [1,2,3]. Although long-term migration patterns are often linked to economic development [4], they can shift rapidly in response to shocks such as conflict, environmental crises and political change [5]. Despite its importance, migration remains difficult to measure consistently: existing data are sparse, concentrated in high-income settings and...
Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning
arXiv:2605.27991v2 Announce Type: replace-cross Abstract: Gradient-flow optimization is usually viewed as an algorithmic procedure for minimizing empirical loss, with training duration selected by validation or heuristic early-stopping rules. We develop a statistical inference framework for the gradient-flow training trajectory itself. The central object is fixed-operator squared-error gradient flow: whenever the fitted value evolves through a time-invariant positive semidefinite training...
Convergence Analysis of Natural Power Method and Its Applications to Control
arXiv:2512.21469v2 Announce Type: replace-cross Abstract: This paper analyzes the discrete-time natural power method, demonstrating its convergence to the dominant $r$-dimensional subspace corresponding to the $r$ eigenvalues with the largest absolute values. This contrasts with the Oja flow, which targets eigenvalues with the largest real parts. We leverage this property to develop methods for model order reduction and low-rank controller synthesis for discrete-time LTI systems, proving...
Pinning on Tight Cuts: Improved Algorithm and Bounds for Unsplittable Multicommodity Flows in Outerplanar Graphs
arXiv:2606.04456v1 Announce Type: new Abstract: The multicommodity flow problem in an undirected capacitated graph $G$ is specified by a set of source-sink pairs with nonnegative demands. A flow is feasible if it routes all demands without exceeding the edge capacities, and it is unsplittable if it routes each demand along a single path. Let $\alpha$ be the smallest value such that the existence of a feasible flow implies the existence of an unsplittable flow that exceeds the edge capacities...
Basis for a hands free blood flow measurement with automated vessel focus
arXiv:2510.11060v2 Announce Type: replace Abstract: Cardiopulmonary resuscitation (CPR) is an essential tool to ensure oxygen supply during cardiac arrest, yet it remains unquantifiable to this day. Low-quality chest compressions or incorrect pressure placement go unnoticed. This paper presents a solution for the quantification of blood flow to guide first responders in their efforts.
High-beta runaway transitions in a fluid model of electromagnetic ion-temperature-gradient turbulence
arXiv:2606.04616v1 Announce Type: new Abstract: Gyrokinetic simulations of tokamak turbulence indicate that fluctuation levels increase abruptly and dramatically when the plasma beta exceeds a certain critical value. This increase in fluctuation levels coincides with a transition from a state dominated by zonal flow to one in which turbulent eddies form radially-elongated 'streamers'.
Claude Fable 5
Claude Fable 5 and Claude Mythos 5. Today we’re launching Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Fable 5’s capabilities exceed those of any model we’ve ever made generally available.
The Ü Programming Language
Ü is a statically-typed compiled programming language designed for writing programs that are both reliable and fast. It has safe and unsafe code separation, compile-time correctness checks, powerful abstractions like RAII and templates, encapsulation, a rich type system, lambdas, coroutines and many other useful features. Ü uses RAII for memory and resource management (no GC is involved), but manual memory management may still be used in unsafe code.