Adaptive Dense Evidence Refinement for Video Relational Reasoning for VRR-QA Challenge
Abstract: VRR-QA evaluates whether video-language systems can infer spatial, temporal, viewpoint, depth, and visibility relations that are not always resolved by a single frame. We present an inference-only system built around adaptive test-time computation. The system first answers each question with a direct video-language model pass, then uses multiple lightweight views to find unstable questions.
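The two-stage flow described in the abstract (a direct pass, then lightweight views used to flag unstable questions) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how instability is measured, so the sketch assumes a question is unstable when too few view answers agree with the direct answer. The names `agreement`, `find_unstable`, and `min_agreement`, and the idea of passing models as plain callables, are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, Iterable, Sequence

# Hypothetical interface: a model pass maps a question id to an answer label.
Answerer = Callable[[str], str]


def agreement(direct: str, views: Sequence[str]) -> float:
    """Fraction of view answers that match the direct answer."""
    if not views:
        return 1.0  # no extra views means nothing contradicts the direct pass
    return sum(v == direct for v in views) / len(views)


def find_unstable(
    questions: Iterable[str],
    direct_pass: Answerer,
    view_passes: Sequence[Answerer],
    min_agreement: float = 0.75,  # assumed threshold, not from the paper
) -> Dict[str, dict]:
    """Run the direct pass, then the lightweight views, and flag questions
    whose views disagree with the direct answer too often (assumed criterion)."""
    results = {}
    for q in questions:
        direct = direct_pass(q)
        views = [view(q) for view in view_passes]
        score = agreement(direct, views)
        results[q] = {
            "direct": direct,
            "views": views,
            "agreement": score,
            "unstable": score < min_agreement,
        }
    return results


# Stub "models" backed by lookup tables, standing in for real model calls.
direct_table = {"q1": "A", "q2": "B"}
view_tables = [
    {"q1": "A", "q2": "C"},
    {"q1": "A", "q2": "B"},
    {"q1": "A", "q2": "D"},
]
report = find_unstable(
    ["q1", "q2"],
    direct_table.__getitem__,
    [table.__getitem__ for table in view_tables],
)
unstable_ids = [q for q, r in report.items() if r["unstable"]]
```

In this toy run all three views agree with the direct answer on `q1`, while only one of three agrees on `q2`, so only `q2` would be flagged. Under this assumed design, extra computation is spent only on the flagged questions.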