Object State Affordance Reasoning

Related Articles from SNS

StateVLM: A State-Aware Vision-Language Model for Robotic Affordance Reasoning

arXiv:2605.03927v2 Announce Type: replace Abstract: Vision-language models (VLMs) have shown remarkable performance in various robotic tasks, as they can perceive visual information and understand natural language instructions. However, when applied to robotics, VLMs remain subject to a fundamental limitation inherent in large language models (LLMs): they struggle with numerical reasoning, particularly in object detection and object-state localization. To explore numerical reasoning as a...

arXiv CS 6d ago

What Objects Enable, Not What They Are: Functional Latent Spaces for Affordance Reasoning

arXiv:2606.05533v1 Announce Type: new Abstract: Existing robot planning systems rely on appearance-based reasoning, where visual observations are encoded into latent spaces organized around object appearances (e.g., recognizing a "cart" based on how it looks). However, planning requires reasoning about task-relevant functionalities of objects (e.g., whether an object is "movable"), which appearance-based latent spaces do not capture. As a result, existing approaches struggle to generalize to...

arXiv CS 5d ago

MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models

arXiv:2606.04627v2 Announce Type: replace Abstract: Mobile agents are increasingly expected to operate everyday applications from screenshots and language goals, where reliable control requires reasoning over screen affordances, multi-step navigation, and future state changes. However, many agents externalize this computation as long textual chains of thought, which slows interaction, increases supervision cost, and complicates deployment.

arXiv CS 1d ago

MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models

arXiv:2606.04627v1 Announce Type: new Abstract: Mobile agents are increasingly expected to operate everyday applications from screenshots and language goals, where reliable control requires reasoning over screen affordances, multi-step navigation, and future state changes. However, many agents externalize this computation as long textual chains of thought, which slows interaction, increases supervision cost, and complicates deployment. We introduce MIRAGE, a framework that learns continuous...

arXiv CS 6d ago

Neuro-Symbolic Learning for Long-Horizon Task Planning Under Complex Logical Constraints

Announce Type: new Abstract: Task planning often suffers from severe efficiency bottlenecks when robots must reason over long-horizon action sequences under complex logical constraints, including object affordances, spatial relationships, and sequential action dependencies. Recent neuro-symbolic methods improve planning efficiency by learning object-importance scores to prune task-irrelevant objects, but they typically rely on fixed offline supervision generated from full search spaces. This...

arXiv CS 2d ago

PhyScene3D: Physically Consistent Interactive 3D Tabletop Scene Generation

Announce Type: replace Abstract: Generating physically consistent 3D tabletop scenes is a fundamental yet underexplored problem for interactive and generalist robotic learning. The challenge stems from dense object hierarchies and irregular affordances. Here, an interactive scene denotes a physically valid, collision-free environment directly loadable into physics simulators.

arXiv CS 6d ago

Trump admin’s cancellation of wind energy projects causes business turmoil

Trump admin’s cancellation of wind energy projects causes business turmoil Seven northeastern states have sued US gov’t for paying TotalEnergies to withdraw from offshore wind projects. French energy giant TotalEnergies is embroiled in a lawsuit between seven US states and the federal government as the administration of President Donald Trump upends domestic energy policy, shutting down some wind energy projects while pushing fossil fuels. It has also raised questions about the...

Al Jazeera 5d ago

GraspFoM: Towards Reconstruction-Driven Robotic Grasping with 3D Foundation Priors

arXiv:2606.08440v1 Announce Type: new Abstract: Robotic grasping is a fundamental capability in robotic manipulation. Yet grasping remains challenging under partial observations.

arXiv CS 1d ago

PhysGraph: A Physics-aware 3D Scene Graph for Perception and Reasoning

Announce Type: new Abstract: To perform a wide range of daily tasks, robots need to construct a 3D representation that is semantically rich, physically grounded, and structured enough to support task planning and affordance prediction. However, existing approaches primarily focus on semantic retrieval, often overlooking physical and kinematic factors. Methods that attempt to model physical properties typically rely on narrow training sets or single-object modeling, limiting scalability and...

arXiv CS 1d ago