Coarse-to-Control: Action-Token Planning for Vision-Language-Action Models

arXiv CS, Monday 08 June 2026, 04:00 UTC
By Jinhao Wu, Shiduo Zhang, Yicheng Liu, Xiaopeng Yu, Sixian Li, Siyin Wang, Hang Zhao, Jing Huo, Yang Gao, Jingjing Gong, Xipeng Qiu, Yu-Gang Jiang

Key Points

arXiv:2606.07107v1 (announce type: new)

Abstract: Most vision-language-action (VLA) models map observations directly to actions without explicit intermediate planning, which limits performance on long-horizon tasks where early mistakes compound. We propose Coarse-to-Control, a plan-execute VLA that introduces planning natively in the action-token space. The key idea is to let the policy first predict a compact sequence of coarse action tokens that summarize the intended future trajectory, and then generate executable action tokens conditioned on this plan. Because both planning and execution share a unified discrete action vocabulary, the plan stays close to the control manifold and provides directly actionable guidance rather than an abstract hint that must be translated back to motor commands. Experiments on LIBERO, SimplerEnv-WidowX, and real-world manipulation tasks show that action-token planning consistently improves over direct action generation, with the largest gains on long-horizon multi-stage tasks.
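The two-stage decoding order described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary size, the sequence lengths, and the names `toy_next_token`, `decode`, and `plan_then_execute` are all hypothetical, and a deterministic stub stands in for the real VLA decoder. It only shows the control flow, in which plan tokens and executable action tokens are drawn from one shared discrete vocabulary and the action tokens are conditioned on the plan.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical size of the shared discrete action vocabulary (an assumption).
VOCAB_SIZE = 256

# A "policy" here is any function mapping a token context to the next token id.
NextToken = Callable[[Sequence[int]], int]


def toy_next_token(context: Sequence[int]) -> int:
    """Stub for one decoder step: a deterministic function of the context."""
    return (sum(context) * 31 + len(context)) % VOCAB_SIZE


def decode(policy: NextToken, prefix: Sequence[int], n: int) -> List[int]:
    """Autoregressively decode n tokens that follow the given prefix."""
    context = list(prefix)
    out: List[int] = []
    for _ in range(n):
        tok = policy(context)
        out.append(tok)
        context.append(tok)
    return out


def plan_then_execute(
    policy: NextToken,
    obs_tokens: Sequence[int],
    plan_len: int = 4,      # hypothetical number of coarse plan tokens
    action_len: int = 16,   # hypothetical number of executable action tokens
) -> Tuple[List[int], List[int]]:
    # Stage 1: coarse plan tokens, conditioned on the observation only.
    plan = decode(policy, obs_tokens, plan_len)
    # Stage 2: executable action tokens, conditioned on observation and plan.
    actions = decode(policy, list(obs_tokens) + plan, action_len)
    return plan, actions


def direct_actions(
    policy: NextToken, obs_tokens: Sequence[int], action_len: int = 16
) -> List[int]:
    # Baseline for contrast: actions decoded straight from the observation.
    return decode(policy, obs_tokens, action_len)


if __name__ == "__main__":
    plan, actions = plan_then_execute(toy_next_token, [1, 2, 3])
    print("plan tokens:  ", plan)
    print("action tokens:", actions)
```

In this sketch the only difference between the planned path and the direct baseline is the extra plan prefix in the context of the second stage; how the real model encodes observations, chooses plan length, or trains the two stages is not specified in the abstract.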


Originally published by arXiv CS.

