Home Knowledge Base MSVD

MSVD

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Towards Accurate Emotion-Attributed Video Captioning via Fine-grained Emotion-Cause Pair Extraction

arXiv:2606.08566v1 Announce Type: new Abstract: Emotional Video Captioning (EVC) is a challenging task that aims to generate factually accurate and emotionally rich descriptions for videos. Existing EVC methods leverage holistic visual features to mine global emotional cues, and then aggregate multimodal features to guide the emotional caption generation, which ignores the critical characteristic of the EVC task. Visual emotions are evoked by specific motivational causes, which are usually...

arXiv CS 1d ago

MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

arXiv:2606.09641v1 Announce Type: new Abstract: The dominant paradigm in video retrieval relies on embedding-based full-corpus scanning, which suffers from inherent computational inefficiency and the semantic asymmetry between information-dense videos and sparse textual queries. To bridge this gap, we introduce \textbf{MAVIS}, a novel multi-agent framework that rethinks retrieval as cooperative reasoning rather than brute-force search. MAVIS first bridges the granularity mismatch by parsing...

arXiv CS 1d ago