Modal Signaling Large Model
Related Articles from SNS
vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models
arXiv:2603.04444v3 Announce Type: replace Abstract: As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, the problem of intelligent request routing -- selecting the right model for each query at inference time -- has become a critical systems challenge. We present vLLM Semantic Router, a signal-driven decision routing framework for Mixture-of-Modality (MoM) model deployments. The central innovation is composable signal orchestration: the system...
vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models
arXiv:2603.04444v4 Announce Type: replace Abstract: As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, the problem of intelligent request routing, that is, selecting the right model for each query at inference time, has become a critical systems challenge. We present vLLM Semantic Router, a signal-driven decision routing framework for Mixture-of-Modality (MoM) model deployments. The architecture follows two complementary Shannon-inspired views.
Edge-Aware Curvature Modeling for Graph Understanding in Large Language Models
arXiv:2606.06073v1 Announce Type: new Abstract: Recently, graph-aware Large Language Models (LLMs) have shown promising capabilities in jointly modeling graph-structured data and textual information. Existing approaches typically employ a graph encoder and a frozen LLM to obtain node representations from graph and textual views, followed by node-level alignment to bridge the two modalities. However, such alignment mechanisms primarily focus on node information while overlooking edge-level...
Focus Then Listen: An Empirical Study of Plug-and-Play Audio Enhancer for Noise-Robust Large Audio Language Models
arXiv:2603.04862v4 Announce Type: replace Abstract: Large audio language models (LALMs) are a class of foundation models for audio understanding. Existing LALMs tend to degrade significantly in real-world noisy acoustic conditions where speech and non-speech sounds interfere. While noise-aware fine-tuning can improve robustness, it requires task-specific noisy data and expensive retraining, limiting scalability.
SVHalluc: Benchmarking Speech-Vision Hallucination in Audio-Visual Large Language Models
arXiv:2606.02642v1 Announce Type: cross Abstract: Despite the success of audio-visual large language models (LLMs), they can produce plausible but ungrounded outputs, termed hallucinations. Existing benchmarks focus on environmental sounds (e.g., dog barking) to indicate event occurrence. In contrast, human speech carries fundamentally different, rich semantics and temporal structures, yet it remains unexplored whether current models can accurately align speech content with corresponding...
TALKPLAY: Multimodal Music Recommendation with Large Language Models
Announce Type: replace Abstract: We present TALKPLAY, a novel multimodal music recommendation system that reformulates recommendation as a token generation problem using large language models (LLMs). By leveraging the instruction-following and natural language generation capabilities of LLMs, our system effectively recommends music from diverse user queries while generating contextually relevant responses. While pretrained LLMs are primarily designed for text modality, TALKPLAY extends their...
China Mobile Jiangsu and ZTE unveil intelligent complaint analysis agent to reshape core network O&M
ZTE has partnered with China Mobile Jiangsu, under the guidance of China Mobile's Network Division, to pioneer the deployment of core network complaint agent capabilities, a significant step toward intelligent network operations and maintenance (O&M). The two companies introduce a multi-modal signaling model and agent technology to rebuild the complaint handling process, automate signaling analysis, and efficiently...
YARD: Y-Architecture Register Decoding for Efficient Hallucination Mitigation in Large Vision-Language Models
arXiv:2605.31429v1 Announce Type: new Abstract: Contrastive decoding (CD) seeks to mitigate hallucinations in Large Vision-Language Models (LVLMs) by contrasting the output distributions of a standard model and a visually degraded model. However, existing training-free CD methods suffer from sub-optimal degraded branches: completely dropping visual tokens is too extreme and induces language hallucinations, while corrupting input images offers coarse control over visual evidence and suffers...
ES-Merging: Biological MLLM Merging via Embedding Space Signals
arXiv:2603.14405v2 Announce Type: replace Abstract: Biological multimodal large language models (MLLMs) have emerged as powerful foundation models for scientific discovery. However, existing models are specialized to a single modality, limiting their ability to solve inherently cross-modal scientific problems. While model merging is an efficient method to combine the different modalities into a unified MLLM, existing methods rely on input-agnostic parameter space heuristics that fail to...
AnyMo: Scaling Any-Modality Conditional Motion Generation with Masked Modeling
Announce Type: replace Abstract: Conditional human motion generation remains a fundamental challenge in computer vision and robotics. Despite significant progress, current methods are often constrained by fixed modality configurations and task-specific architectures, leaving cross-modal interactions and the scaling laws of multimodal-conditioned synthesis largely underexplored. A key bottleneck is the scarcity of large-scale modality-aligned motion data, limiting generalization across...