Home Knowledge Base Modality-Aware Capacity Scaling

Modality-Aware Capacity Scaling

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference

arXiv:2605.05225v3 Announce Type: replace Abstract: Mixture-of-Experts Multimodal Large Language Models (MoE MLLMs) suffer from a significant efficiency bottleneck during Expert Parallelism (EP) inference due to the straggler effect. This issue is worsened in the multimodal context, as existing token-count-based load balancing methods fail to address two unique challenges: (1) Information Heterogeneity, where numerous redundant visual tokens are treated equally to semantically critical ones,...

arXiv CS 2d ago