Home Knowledge Base Multi-Granular Trajectory Alignment

Multi-Granular Trajectory Alignment

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

MTA: Multi-Granular Trajectory Alignment for Large Language Model Distillation

Announce Type: replace Abstract: Knowledge distillation is a key technique for compressing large language models (LLMs), but most existing methods align representations at fixed layers or token-level outputs, ignoring how representations evolve across depth. As a result, the student is only weakly guided to capture the teacher's internal relational structure during distillation, which limits knowledge transfer. To address this limitation, we propose Multi-Granular Trajectory Alignment (MTA),...

arXiv CS 7d ago