Home Knowledge Base MGSD

MGSD

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Learning Visual Spatial Planning from Symbolic State via Modality-Gap-Aware Self-Distillation

Announce Type: new Abstract: While vision-language models excel at general multimodal understanding, they still struggle with visual spatial planning. We attribute this to a perception-reasoning modality gap: visual planning requires models to infer latent state structures from pixels and then reason over the recovered structure to produce valid actions, whereas symbolic planning directly leverages explicit objects and constraints. This creates dual bottlenecks in visual state recovery and...

arXiv CS 5d ago

Learning Visual Spatial Planning from Symbolic State via Modality-Gap-Aware Self-Distillation

Announce Type: replace Abstract: While vision-language models excel at general multimodal understanding, they still struggle with visual spatial planning. We attribute this to a perception-reasoning modality gap: visual planning requires models to infer latent state structures from pixels and then reason over the recovered structure to produce valid actions, whereas symbolic planning directly leverages explicit objects and constraints. This creates dual bottlenecks in visual state recovery...

arXiv CS 1d ago