Mollified Value Learning

arXiv CS, Monday 01 June 2026, 04:00 UTC. By Hrishikesh Viswanath, Juanwu Lu, S. Talha Bukhari, Mihir Chauhan, Damon Conover, Ziran Wang, Aniket Bera.

Key Points

arXiv:2602.23280v2 (announce type: replace). Abstract: Offline goal-conditioned reinforcement learning (GCRL) learns goal-reaching behaviors from static datasets, but accurate value estimation remains challenging under limited state-action coverage. Existing physics-informed approaches address this by imposing pointwise, distance-like geometric constraints derived from Hamilton-Jacobi-Bellman (HJB) optimality principles, often through first-order partial differential equations such as the Eikonal equation. However, enforcing local consistency through explicit differential structure can become unstable in complex, high-dimensional environments. Our key insight is to instead reinterpret distance-like constraints as an expectation over a local spatial measure. By aggregating constraints over this measure rather than evaluating them pointwise, the objective acts as a spatial mollifier, inducing distance-like value geometry without requiring expensive differential operators. We refer to this as Mollified Value Learning (MVL). Experiments across navigation and high-dimensional robotic manipulation tasks show that MVL learns structured value representations and improves goal-reaching performance when used with implicit value representation learning methods. Open-source code is available at https://github.com/HrishikeshVish/MVL.
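The abstract does not spell out the MVL objective, so the following is only an illustrative sketch of the contrast it describes, not the authors' method. It compares a pointwise Eikonal residual, (||grad V|| - 1)^2, with a constraint averaged over a local Gaussian neighborhood. The averaged penalty used here (a hinge on 1-Lipschitz violations), the Gaussian choice of measure, and every name (`eikonal_residual`, `mollified_residual`, `sigma`, `n_samples`) are assumptions made for illustration.

```python
import numpy as np

def eikonal_residual(value_fn, s, g, eps=1e-5):
    """Pointwise residual (||grad_s V|| - 1)^2, using central finite differences."""
    grad = np.zeros_like(s)
    for i in range(s.size):
        step = np.zeros_like(s)
        step[i] = eps
        grad[i] = (value_fn(s + step, g) - value_fn(s - step, g)) / (2 * eps)
    return float((np.linalg.norm(grad) - 1.0) ** 2)

def mollified_residual(value_fn, s, g, sigma=0.1, n_samples=256, seed=0):
    """Assumed surrogate: average, over Gaussian perturbations d of the state,
    of the squared amount by which |V(s + d) - V(s)| exceeds ||d||.
    No gradient of V is needed, only function evaluations."""
    rng = np.random.default_rng(seed)
    deltas = rng.normal(scale=sigma, size=(n_samples, s.size))
    base = value_fn(s, g)
    penalties = []
    for d in deltas:
        change = abs(value_fn(s + d, g) - base)
        penalties.append(max(0.0, change - np.linalg.norm(d)) ** 2)
    return float(np.mean(penalties))

# Hypothetical value functions: the true Euclidean distance to the goal,
# and a version scaled by 2 that violates the unit-slope constraint.
def euclid(s, g):
    return float(np.linalg.norm(s - g))

def scaled(s, g):
    return 2.0 * euclid(s, g)

s = np.array([1.0, 2.0])
g = np.array([4.0, 6.0])
print("pointwise:", eikonal_residual(euclid, s, g), eikonal_residual(scaled, s, g))
print("averaged: ", mollified_residual(euclid, s, g), mollified_residual(scaled, s, g))
```

Under these assumptions, both residuals are close to zero for the Euclidean distance and clearly positive for the scaled function. The averaged version only evaluates the value function at sampled points, which is the property the abstract attributes to aggregating constraints over a local measure.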


Originally published by arXiv CS.
