TVSum
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
CoSTL: Comprehensive Spatial-Temporal Representation Learning for Moment Retrieval and Highlight Detection
arXiv:2606.01149v1 Announce Type: new Abstract: Video Moment Retrieval (MR) and Highlight Detection (HD) are crucial tasks in video analysis that aim to localize specific moments and estimate clip-wise relevance based on a given text query. Recent approaches treat them as similar video grounding tasks and use the same architecture to solve them. These tasks require both fine-grained comprehension at the image level and high-level temporal understanding across the entire video.
TRIM: A Self-Supervised Video Summarization Framework Maximizing Temporal Relative Information and Representativeness
Announce Type: replace Abstract: The increasing ubiquity of video content and the corresponding demand for efficient access to meaningful information have elevated video summarization and video highlights as a vital research area. However, many state-of-the-art methods depend heavily either on supervised annotations or on attention-based models, which are computationally expensive and brittle in the face of distribution shifts that hinder cross-domain applicability across datasets. We...