RoboArena
Related Articles from SNS
A Chinese robotics start-up beat Nvidia on a global AI ranking. Is a new tech war brewing?
Spirit AI says its foundation model for embodied intelligence is the first from China to top the RoboArena global leaderboard. As artificial intelligence steps out of the digital realm and into the real world, the race to build the embodied “brains” powering next-generation robots has become the newest battleground in tech competition between China and the United States.
Cosmos 3: Omnimodal World Models for Physical AI
arXiv:2606.02800v1 Announce Type: new Abstract: We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI -- effectively subsuming vision-language models, video generators, world simulators, and world-action models into a...
OSCAR: Omni-Embodiment Skeleton-Conditioned World Action Model for Robotics
arXiv:2606.04463v1 Announce Type: new Abstract: We present OSCAR, a precise action-conditioned video world model that generalizes across different robot embodiments and enables robot policy evaluation. Existing video world models face three main challenges for real-world robot evaluation: limited scenario diversity in current robot training datasets, imprecise action following, and poor generalization across embodiments for broad adoption. We tackle these challenges from two perspectives.
OSCAR: Omni-Embodiment Action-Conditioned World Model for Robotics
arXiv:2606.04463v2 Announce Type: replace Abstract: We present OSCAR, a precise action-conditioned video world model that generalizes across different robot embodiments and enables robot policy evaluation. Existing video world models face three main challenges for real-world robot evaluation: limited scenario diversity in current robot training datasets, imprecise action following, and poor generalization across embodiments for broad adoption. We tackle these challenges from two perspectives.
Cosmos 3: Omnimodal World Models for Physical AI
arXiv:2606.02800v2 Announce Type: replace Abstract: We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI -- effectively subsuming vision-language models, video generators, world simulators, and world-action models into a...