Streaming Multi
Related Articles from SNS
X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding
arXiv:2606.02482v2 Announce Type: replace Abstract: While video streaming understanding has made significant strides, real-world applications, such as live sports broadcasting, autonomous driving, and multi-screen collaboration, inherently demand continuous, multi-stream interactions. However, existing benchmarks are confined to single-stream paradigms, leaving a critical gap in evaluating online, cross-stream reasoning. To bridge this, we introduce X-Stream, the first benchmark dedicated to multi-stream streaming...
Online Learning with Recency: Algorithms for Sliding-window Streaming Multi-armed Bandits
arXiv:2606.08977v1 Announce Type: new Abstract: Motivated by the recency effect in online learning, we study algorithms for single-pass *sliding-window streaming multi-armed bandits (MABs)*. In this setting, we are given $n$ arms with unknown sub-Gaussian reward distributions and a parameter $W$. The arms arrive in a single-pass stream, and only the most recent $W$ arms are considered valid.
Streaming Communication in Multi-Agent Reasoning
arXiv:2606.05158v1 Announce Type: new Abstract: Multi-agent reasoning systems adopt a "generate-then-transfer" paradigm that forces end-to-end latency to scale linearly with pipeline depth. We introduce StreamMA, a multi-agent reasoning system that streams each reasoning step to downstream agents as soon as it is generated, pipelining adjacent agents and thus reducing latency. Surprisingly, this pipelining also improves effectiveness: because multi-step reasoning quality is non-uniform and...
Analyzing Stream Collapse in Hyper-Connections: From Diagnosis to Mitigation
arXiv:2606.03483v1 Announce Type: new Abstract: Hyper-Connections (HC) replace the single Transformer residual stream with multiple streams, introducing a permutation symmetry over stream indices. We study how this symmetry is resolved in practice: whether streams specialize in a balanced way or exhibit dominant-stream usage. Using fine-grained diagnostics for HC-based language models, we trace how multi-stream representations are actually used.
FlashTTS: Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation
arXiv:2606.09141v1 Announce Type: cross Abstract: Recent progress in speech dialogue systems requires Text-to-Speech (TTS) models to be faster and more responsive. Modern speech dialogue systems impose two primary requirements on TTS models: low latency and support for streaming inputs and outputs. However, most existing single-codebook LLM-based TTS methods rely on multi-stage pipelines that lack native streaming capabilities.
Self-supervised Learning Matters: A Simple Ensemble Solution for Micro-Gesture Recognition
arXiv:2606.09261v1 Announce Type: new Abstract: In this paper, we present XInsight Lab's solution to the micro-gesture classification track of the 4th MiGA Challenge at IJCAI 2026, in which our solution ranked first and achieved a new state-of-the-art result. We propose a multimodal ensemble framework that integrates a self-supervised RGB-based model with supervised multi-stream models from previous solutions. The self-supervised RGB model is pretrained on 120K unlabeled clips via masked...
The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models
Announce Type: new Abstract: Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where representations are anchored by semantic digits and modulated by continuous carry fibers. We propose the Noisy Quantization Model to explain this geometry, framing...
An Enhanced "Flux-Corrected Transport"-Based Plasmasphere Refilling Model
Announce Type: replace Abstract: A previously developed multi-ion, two-stream Flux-Corrected Transport (FCT) hydrodynamic model for plasmasphere refilling has been extended to incorporate self-consistent electron temperature evolution. The past assumption of a constant temperature along the modeled flux tube has been replaced by solving the electron energy equation, permitting spatially and temporally varying temperature. This improvement provides a more physically complete representation of...
OmniCap-IF: Benchmarking and Improving Instruction Following Abilities for Omni-Video Captioning
arXiv:2606.08572v1 Announce Type: new Abstract: While Omni-modal Large Language Models (OLLMs) have demonstrated impressive capabilities in jointly processing audio and visual streams, their ability to strictly adhere to complex, multi-faceted user instructions remains largely unexplored. Existing benchmarks primarily focus on holistic video understanding or text-only instruction following, failing to capture the intricate interplay between modalities and user constraints. To bridge this...