Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

arXiv CS, Wednesday 03 June 2026, 04:00 UTC. By Liuji Chen, Dianxing Tang, Xing Shi, Dingshuo Chen, Qiang Liu, Shu Wu, Liang Wang

Key Points

arXiv:2606.02132v2

Abstract: Agentic reinforcement learning can induce tool abuse, where models overuse external tools even for queries solvable by internal reasoning. Existing approaches mitigate this issue with uniform tool-use penalties or hard limits, which reduce tool frequency but may also suppress useful tool-assisted exploration. We propose EAPO, an Efficient Agentic Policy Optimization framework that learns selective tool use. EAPO introduces tool-free trajectories into each rollout group, applies difficulty-aware reward shaping to penalize redundant tool calls mainly on easier queries, and uses confidence-aware token reweighting to improve policy learning. Across nine mathematical and knowledge-intensive reasoning benchmarks, EAPO consistently improves the accuracy-efficiency trade-off on Qwen2.5-3B, Qwen2.5-7B, and Llama3.1-8B. Compared with GRPO, EAPO improves average performance by 10.45%, 7.27%, and 9.69%, while reducing average tool calls by 18.33%, 18.33%, and 24.59%, respectively. These results show that agents can learn when not to use tools without compromising tool-integrated reasoning.
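
The abstract gives no formulas, so the following is only a rough sketch of how two of the described ideas might fit together: mixing tool-free trajectories into a rollout group, and penalizing tool calls more on easier queries. Everything here is an assumption made for illustration, not the authors' method: the `Trajectory` type, the `penalty` coefficient, estimating "easiness" from the accuracy of the tool-free rollouts, and the GRPO-style standardization of rewards within the group. Confidence-aware token reweighting is not covered.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Trajectory:
    """Hypothetical record of one rollout for a single query."""
    correct: bool    # whether the final answer was right
    tool_calls: int  # number of external tool calls made


def shaped_rewards(group, penalty=0.1):
    """Assumed difficulty-aware shaping (not taken from the paper).

    Easiness is estimated as the accuracy of the tool-free rollouts in the
    group; the per-call penalty is scaled by it, so tool calls cost more
    on queries the model can already solve without tools.
    """
    tool_free = [t for t in group if t.tool_calls == 0]
    if tool_free:
        easiness = mean(1.0 if t.correct else 0.0 for t in tool_free)
    else:
        easiness = 0.0
    return [
        (1.0 if t.correct else 0.0) - penalty * easiness * t.tool_calls
        for t in group
    ]


def group_advantages(rewards):
    """Group-relative advantages: standardize rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


# Easy query: tool-free rollouts succeed, so the tool-using one is penalized.
easy = [Trajectory(True, 0), Trajectory(True, 0), Trajectory(True, 2)]
# Hard query: tool-free rollouts fail, so tool use carries no penalty.
hard = [Trajectory(False, 0), Trajectory(False, 0), Trajectory(True, 2)]

easy_adv = group_advantages(shaped_rewards(easy))
hard_adv = group_advantages(shaped_rewards(hard))
```

Under these assumptions, the tool-using rollout gets a negative advantage on the easy query, where tools were redundant, and a positive one on the hard query, where only the tool-assisted rollout succeeded.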

Originally published by arXiv CS.
