Linear Function Approximation
Related Articles from SNS
Fast and Robust Convergence Rate for TD(0) with Linear Function Approximation, Universal Learning Steps and I.I.D. Samples
arXiv:2606.05967v2 Announce Type: replace-cross Abstract: In this paper, we study the finite-time behavior of the TD(0) temporal-difference method with linear function approximation (LFA). We consider on-policy independent and identically distributed (i.i.d.) samples, a constant learning step, and the Polyak-Juditsky averaging method.
A Robust $\widetilde{\mathcal{O}}(1/\sqrt{T})$ Rate for Unprojected TD Learning with Linear Function Approximation
Announce Type: replace Abstract: We investigate the finite-time convergence properties of Temporal Difference (TD) learning with linear function approximation, a cornerstone of reinforcement learning. We are interested in the so-called "robust" setting, where the convergence guarantee does not depend on the potential function's minimal curvature. While prior work has established convergence guarantees in this setting, these results typically rely on the artificial assumption that each...
Adalina: Adaptive Linear Approximation for the Shapley Value and Beyond
arXiv:2604.08438v2 Announce Type: replace Abstract: The Shapley value, and its broader family of semi-values, has received much attention in various attribution problems. A fundamental and long-standing challenge is their efficient approximation, since exact computation generally requires an exponential number of utility queries in the number of players $n$. To meet the challenges of large-scale applications, we explore the limits of efficiently approximating semi-values under a $\Theta(n)$...
Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics
arXiv:2606.02645v1 Announce Type: cross Abstract: Periodic target updates in Q-learning and soft target updates in actor-critic methods are empirically well established stabilization mechanisms, but their precise theoretical explanation is still incomplete. This paper gives a rigorous and exact analysis of these mechanisms for Q-learning with linear function approximation (linear Q-learning) using the exact switched linear system (SLS) dynamics induced by the Bellman maximum and the joint...
Compositional Approximation Can Strictly Outperform Superpositional Approximation
arXiv:2606.08727v1 Announce Type: new Abstract: Many classically studied function classes are known to be approximated optimally by superpositional methods, i.e. with approximants constructed as the linear combination of elements in some dictionary. Here optimality means that the uniform approximation error viewed as a function of the number of parameters used has polynomial decay of the highest order achievable by any parametrized method whose parameters can be encoded as a bit string of...
RPA as a Hessian Closure: Effective Functionals and Source-Variable Duality Across DFT, LR-TDDFT, 1RDMFT, and MBPT
Announce Type: new Abstract: We present a variational formulation of the random phase approximation (RPA) that places density functional theory (DFT), linear-response time-dependent density functional theory (LR-TDDFT), one-body reduced density matrix functional theory (1RDMFT), and Green's function many-body perturbation theory (MBPT) into a common source-variable hierarchy. The central claim is that RPA is not best defined by any one problem-specific formula, diagrammatic resummation, or...
$\mathcal{H}_2$-optimal model reduction of linear quadratic-output systems by multivariate rational interpolation
Announce Type: replace Abstract: This paper addresses the $\mathcal{H}_2$-optimal approximation of linear dynamical systems with quadratic-output functions, also known as linear quadratic-output systems. Our major contributions are threefold. First, we derive interpolatory first-order optimality conditions for the linear quadratic-output $\mathcal{H}_2$ minimization problem.
Weighted universal approximation of differentiable maps on infinite-dimensional manifolds
Announce Type: cross Abstract: We generalize the universal approximation theorem for functional input neural networks (FNN) to differentiable maps by including the approximation of the derivatives. A FNN maps the input from a possibly infinite-dimensional weighted manifold to the real-valued hidden layer, on which a non-linear scalar activation function is applied, and then returns the output into a Banach space via some linear readouts. By proving a weighted Nachbin theorem, we establish a...
Trace-Mediated Peak Bias: Bridging Temporal Credit Assignment and Cognitive Heuristics in Deep Reinforcement Learning
arXiv:2606.04735v1 Announce Type: new Abstract: Temporal credit assignment is central to both biological and artificial intelligence, yet its interaction with non-linear function approximation is poorly understood. We identify a systematic failure mode in deep reinforcement learning (RL) termed Trace-Mediated Peak Bias (TMPB). At intermediate eligibility trace depths, agents irrationally prefer trajectories with high-magnitude reward "peaks" over alternatives with higher cumulative returns.