
Value Flows

Related Articles from SNS

Value Flows

arXiv:2510.07650v4 Announce Type: replace Abstract: While most reinforcement learning methods today flatten the distribution of future returns to a single scalar value, distributional RL methods exploit the return distribution to provide stronger learning signals and to enable applications in exploration and safe RL. While the predominant method for estimating the return distribution is by modeling it as a categorical distribution over discrete bins or estimating a finite number of...

arXiv CS 8d ago

Q-VGM: Q-Guided Value-Gradient Matching for Flow-Matching VLA Policies

Announce Type: new Abstract: We propose Q-Guided Value-Gradient Matching (Q-VGM), an off-policy reinforcement learning (RL) method that tackles a long-standing challenge in fine-tuning flow-matching vision-language-action (VLA) policies: efficiently improving an expressive flow-matching action expert with respect to a learned Q-function. Effective improvement must exploit the first-order (gradient) information of the critic, but this is difficult for flow policies, because directly...

arXiv CS 1d ago

Deep learning four decades of human migration

Abstract: Human migration is a fundamental driver of global demographic change, shaping population structure, labour markets and social policy across countries [1,2,3]. Although long-term migration patterns are often linked to economic development [4], they can shift rapidly in response to shocks such as conflict, environmental crises and political change [5]. Despite its importance, migration remains difficult to measure consistently: existing data are sparse, concentrated in high-income settings and...

Nature 22h ago

Gradient-Flow Optimization as Dynamic Random-Effects Inference: Testing and Early Stopping with Applications to Deep Learning

arXiv:2605.27991v2 Announce Type: replace-cross Abstract: Gradient-flow optimization is usually viewed as an algorithmic procedure for minimizing empirical loss, with training duration selected by validation or heuristic early-stopping rules. We develop a statistical inference framework for the gradient-flow training trajectory itself. The central object is fixed-operator squared-error gradient flow: whenever the fitted value evolves through a time-invariant positive semidefinite training...

arXiv CS 5d ago

Convergence Analysis of Natural Power Method and Its Applications to Control

arXiv:2512.21469v2 Announce Type: replace-cross Abstract: This paper analyzes the discrete-time natural power method, demonstrating its convergence to the dominant $r$-dimensional subspace corresponding to the $r$ eigenvalues with the largest absolute values. This contrasts with the Oja flow, which targets eigenvalues with the largest real parts. We leverage this property to develop methods for model order reduction and low-rank controller synthesis for discrete-time LTI systems, proving...

arXiv CS 9d ago

Pinning on Tight Cuts: Improved Algorithm and Bounds for Unsplittable Multicommodity Flows in Outerplanar Graphs

arXiv:2606.04456v1 Announce Type: new Abstract: The multicommodity flow problem in an undirected capacitated graph $G$ is specified by a set of source-sink pairs with nonnegative demands. A flow is feasible if it routes all demands without exceeding the edge capacities, and it is unsplittable if it routes each demand along a single path. Let $\alpha$ be the smallest value such that the existence of a feasible flow implies the existence of an unsplittable flow that exceeds the edge capacities...

arXiv CS 6d ago

Basis for a hands free blood flow measurement with automated vessel focus

arXiv:2510.11060v2 Announce Type: replace Abstract: Cardiopulmonary resuscitation (CPR) is an essential tool to ensure oxygen supply during cardiac arrest, yet not quantifiable to this day. Low-quality chest compressions or wrong pressure placement go unnoticed. This paper presents a solution for the quantification of blood flow to guide first responders in their efforts.

arXiv Physics 9d ago

High-beta runaway transitions in a fluid model of electromagnetic ion-temperature-gradient turbulence

arXiv:2606.04616v1 Announce Type: new Abstract: Gyrokinetic simulations of tokamak turbulence indicate that fluctuation levels increase abruptly and dramatically when the plasma beta exceeds a certain critical value. This increase in fluctuation levels coincides with a transition from a state dominated by zonal flow to one in which turbulent eddies form radially-elongated 'streamers'.

arXiv Physics 6d ago

Claude Fable 5

Claude Fable 5 and Claude Mythos 5. Today we’re launching Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Fable 5’s capabilities exceed those of any model we’ve ever made generally available.

Hacker News 1d ago

The Ü Programming Language

Ü is a statically typed, compiled programming language designed for writing programs that should be both reliable and fast. It has safe and unsafe code separation, compile-time correctness checks, powerful abstractions like RAII and templates, encapsulation, a rich type system, lambdas, coroutines and many other useful features. Ü uses RAII for memory and resource management (no GC is involved), but manual memory management may still be used in unsafe code.

Hacker News 6d ago