Update-Free On-Policy Steering via Verifiers

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Maria Attarian, Ian Vyse, Claas Voelcker, Jasper Gerigk, Evgenii Opryshko, Anas Almasri, Sumeet Singh, Yilun Du, Igor Gilitschenski

Key Points

arXiv:2603.10282v2 (announce type: replace)

Abstract: In recent years, Behavior Cloning (BC) has become one of the most prevalent methods for learning manipulation from human demonstrations. Despite their successes, BC policies are often brittle and struggle with precise manipulation. To overcome these issues, we propose UF-OPS, an Update-Free On-Policy Steering method that enables the robot to predict the success likelihood of its actions and adapt its strategy at execution time. We accomplish this by training verifier functions on policy rollout data obtained during an initial evaluation of the policy. These verifiers are subsequently used to steer the base policy toward actions with a higher likelihood of success. Our method improves the performance of black-box diffusion policies without changing the base policy's parameters, which makes it lightweight and flexible. We present results from both simulation and real-world data and achieve an average 49% improvement in success rate over the base policy across 5 real tasks.
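The abstract does not say how the verifiers steer the base policy, so the following is only an illustrative sketch of one plausible reading: draw several candidate actions from a frozen, black-box policy, score each with a verifier, and execute the highest-scoring one. Best-of-N selection is an assumption here, not the authors' stated method, and every name in the block (`steer_best_of_n`, `make_toy_policy`, `toy_verifier`) is hypothetical. The toy verifier is a hand-written function standing in for a model that would be trained on rollout data.

```python
import math
import random
from typing import Callable, Tuple

Observation = float  # toy 1-D observation (assumption for illustration)
Action = float       # toy 1-D action (assumption for illustration)


def steer_best_of_n(
    sample_action: Callable[[Observation], Action],
    verifier: Callable[[Observation, Action], float],
    obs: Observation,
    num_candidates: int = 16,
) -> Tuple[Action, float]:
    """Pick the candidate action that the verifier scores highest.

    The base policy is only queried through sample_action, so its
    parameters are never read or updated.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    candidates = [sample_action(obs) for _ in range(num_candidates)]
    scores = [verifier(obs, action) for action in candidates]
    best_index = max(range(num_candidates), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


def make_toy_policy(seed: int) -> Callable[[Observation], Action]:
    """Hypothetical stochastic base policy: Gaussian noise around obs."""
    rng = random.Random(seed)

    def sample_action(obs: Observation) -> Action:
        return obs + rng.gauss(0.0, 1.0)

    return sample_action


def toy_verifier(obs: Observation, action: Action) -> float:
    """Hand-written stand-in for a learned success predictor.

    Returns a score in [0, 1] that peaks when action == obs + 0.5.
    """
    return math.exp(-((action - (obs + 0.5)) ** 2))


if __name__ == "__main__":
    policy = make_toy_policy(seed=0)
    action, score = steer_best_of_n(policy, toy_verifier, obs=0.0)
    print(f"chosen action: {action:.3f}, verifier score: {score:.3f}")
```

In this sketch the only tunable quantity is `num_candidates`: a larger value gives the verifier more options to choose from, at the cost of more policy samples per step.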

Originally published by arXiv CS.
