Scene-CoT

Related Articles from SNS

From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Abstract: Vision-Language Models (VLMs) are increasingly deployed as the perception and reasoning backbone of autonomous agents acting in the wild, with autonomous driving (AD) being one of the most safety-critical instances. Reliable temporal understanding is essential for such agents to anticipate events, attribute causes, and act safely in dynamic environments, yet this remains a significant challenge even for state-of-the-art (SoTA) VLMs. Prior video benchmarks...

arXiv CS 8d ago

From Segments to Scenes: Temporal Understanding for Agentic Autonomous Driving via Vision-Language Models

Abstract: Vision-Language Models (VLMs) are increasingly deployed as the perception and reasoning backbone of autonomous agents acting in the wild, with autonomous driving (AD) being one of the most safety-critical instances. Reliable temporal understanding is essential for such agents to anticipate events, attribute causes, and act safely in dynamic environments, yet this remains a significant challenge even for state-of-the-art (SoTA) VLMs. Prior video benchmarks...

arXiv CS 6d ago