Home Knowledge Base d2-AnyOrder

d2-AnyOrder

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

d2: Improving Reasoning in Diffusion Language Models via Trajectory Likelihood Estimation

arXiv:2509.21474v4 Announce Type: replace Abstract: While diffusion language models (DLMs) have achieved competitive performance in text generation, improving their reasoning ability with reinforcement learning remains an active research area. Here, we introduce d2, a reasoning framework tailored for masked DLMs. Central to our framework is a new policy gradient algorithm that relies on accurate estimates of the sampling trajectory likelihoods.

arXiv CS 8d ago