Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence

Related Articles from SNS

MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence

arXiv:2606.02463v1 (announce type: new). Abstract: In 3D environments, embodied agents answer spatial questions by reasoning over a mixture of modalities, including natural language, RGB images, point clouds, depth maps, and camera poses. Existing vision-language models (VLMs) are fine-tuned on a single modality. This ignores the semantics of the question, which may favor a modality other than the one the model was fine-tuned on.

arXiv CS 8d ago