VistaArena

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

VistaHop: Benchmarking Multi-hop Visual Reasoning for Visual DeepSearch

arXiv:2606.03273v1. Abstract: Visual DeepSearch requires multimodal large reasoning model (MLRM) agents to answer complex visual queries by repeatedly inspecting image regions, grounding intermediate reasoning in visual evidence, and connecting fine-grained clues across long reasoning chains. However, existing benchmarks mainly focus on single-step visual understanding or static image-question answering, offering limited evaluation of iterative image inspection,...

arXiv CS 7d ago