Moebius: 0.2B image inpainting model with 10B-level performance
Key Points
Overall pipeline of Moebius. We adopt the Latent Diffusion Model (LDM) framework equipped with Latent Categories Guidance (LCG). To achieve extreme architectural efficiency, the denoising U-Net is systematically restructured using our proposed LλM I blocks (detailed in Sec. 3.2). Furthermore, an adaptive multi-granularity distillation strategy (Sec. 3.3) is applied during training to align our lightweight specialist with the high-capacity teacher, successfully mitigating the capacity drop caused by extreme structural compression.
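To make the teacher-student alignment idea concrete, here is a minimal sketch of a generic distillation loss that combines an output-level term with a feature-level term. It is not the paper's adaptive multi-granularity strategy, which this summary does not spell out. The function names, the fixed weights `w_out` and `w_feat`, and the toy values are all assumptions made for illustration.

```python
# Illustrative only: a generic teacher-student distillation loss, not the
# paper's adaptive multi-granularity strategy (see Sec. 3.3 of the paper).
# All names and weights below are hypothetical.

def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_out, teacher_out, student_feats, teacher_feats,
                      w_out=1.0, w_feat=0.5):
    """Weighted sum of an output-level term and a feature-level term.

    student_out / teacher_out: flattened denoiser predictions.
    student_feats / teacher_feats: lists of flattened intermediate features,
    assumed here to already share the same shape per layer (in practice a
    small student usually needs a projection to match the teacher's width).
    w_out and w_feat are fixed, hypothetical weights; an adaptive scheme
    would adjust them during training.
    """
    out_term = mse(student_out, teacher_out)
    feat_term = sum(mse(s, t) for s, t in zip(student_feats, teacher_feats))
    feat_term /= max(len(student_feats), 1)  # average over matched layers
    return w_out * out_term + w_feat * feat_term


# Toy values standing in for real tensors.
loss = distillation_loss(
    student_out=[0.0, 1.0],
    teacher_out=[0.0, 3.0],
    student_feats=[[1.0, 1.0]],
    teacher_feats=[[1.0, 3.0]],
)
print(loss)
```

In a real training loop the lists would be tensors from the lightweight student and the high-capacity teacher evaluated on the same noisy latent, and this loss would typically be added to the usual denoising objective.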
@misc{DuanAndXu2026Moebius,
title={Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance},
author={Kangsheng Duan and Ziyang Xu and Wenyu Liu and Xiaohu Ruan and Xiaoxin Chen and Xinggang Wang},
year={2026},
eprint={2606.19195},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.19195},
}