Insertion Based Sequence Generation with Learnable Order Dynamics
arXiv:2602.18695v2 Announce Type: replace
Abstract: Existing insertion-based masked diffusion models, which generate sequences by interleaving token insertion with unmasking, use fixed schedules that do not depend on the data. For structured sequences such as graphs and molecules, learning data-dependent generation orders can improve generation quality by reducing uncertainty over the action space. We propose LoFlexMDM, an insertion-based masked diffusion model with learnable order dynamics that learns data-dependent insertion and unmasking rates. We generalize the discrete flow matching framework to variable-length sequences, and we propose a tractable schedule parameterization and a training objective for jointly training the generator and the target order dynamics. On de novo and fragment-constrained molecule generation, LoFlexMDM improves sample quality over FlexMDM by up to 17.5% and 6.7%, respectively. These results show that learning the target generation order can improve insertion-based diffusion models without giving up tractable training. We open-source the code at https://github.com/dhruvdcoder/LoFlexMDM.
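To make the difference between a fixed schedule and a data-dependent one concrete, here is a toy sketch of a generation loop that interleaves mask insertion with unmasking. It is not the LoFlexMDM algorithm and does not use the discrete flow matching objective. The names `step`, `fixed_rate`, `context_insert_rate`, and `VOCAB` are hypothetical, the rates are treated as per-step probabilities for simplicity, and the hand-written context rule merely stands in for rates that the paper says are learned.

```python
import random

MASK = "<mask>"
VOCAB = ["C", "N", "O"]  # hypothetical toy vocabulary


def step(seq, insert_rate, unmask_rate, rng):
    """Apply one toy step: maybe insert a mask in each gap, maybe unmask each mask."""
    out = []
    for i in range(len(seq) + 1):
        # Gap i sits just before seq[i]; the last gap is the end of the sequence.
        if rng.random() < insert_rate(seq, i):
            out.append(MASK)
        if i < len(seq):
            tok = seq[i]
            if tok == MASK and rng.random() < unmask_rate(seq, i):
                tok = rng.choice(VOCAB)  # stand-in for a learned token predictor
            out.append(tok)
    return out


def fixed_rate(p):
    """Schedule that ignores the sequence: the same probability everywhere."""
    return lambda seq, i: p


def context_insert_rate(seq, i):
    """Hand-written data-dependent rule (an assumption, not a learned rate):
    insert more often into an empty sequence or right after a revealed token."""
    left_revealed = i > 0 and seq[i - 1] != MASK
    return 0.6 if left_revealed or not seq else 0.1


rng = random.Random(0)
seq = []
for _ in range(5):
    seq = step(seq, context_insert_rate, fixed_rate(0.5), rng)
print(seq)
```

Swapping `context_insert_rate` for `fixed_rate(0.3)` gives the fixed-schedule variant: the insertion probability then no longer depends on what has already been generated.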