Home Knowledge Base TX-G2

TX-G2

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

See Less, Specify More: Visual Evidence Budgets for Generalizable VLAs

Announce Type: new Abstract: Generalization remains a central bottleneck for vision-language-action (VLA) models: under distractors, appearance shifts, and semantically similar tasks, the policy must often infer local execution details from coarse instructions while also deciding which parts of the image matter for control. We present S2 (See Less, Specify More), a framework for improving VLA generalization by training the executor under a cleaner interface. Specify More preserves the...

arXiv CS 7d ago

Continuous Reasoning for Vision-Language-Action

Announce Type: replace Abstract: Natural language is a powerful reasoning medium for language and vision-language models, but it is mismatched to the granularity of continuous control. Text and explicit subgoals operate at task-level granularity, whereas vision-language-action (VLA) policies must choose actions at a much finer temporal scale; a single reasoning step can therefore span many action chunks while remaining only weakly coupled to the action needed now. This suggests a different...

arXiv CS 1d ago

See Less, Specify More: Visual Evidence Budgets for Generalizable VLAs

arXiv:2606.02735v2 Announce Type: replace Abstract: Generalization remains a central bottleneck for vision-language-action (VLA) models: under distractors, appearance shifts, and semantically similar tasks, the policy must often infer local execution details from coarse instructions while also deciding which parts of the image matter for control. We present S2 (See Less, Specify More), a framework for improving VLA generalization by training the executor under a cleaner interface. Specify...

arXiv CS 1d ago