Boundary-Guided Policy Optimization for Memory

Related Articles from SNS

Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models

arXiv:2510.11683v3. Abstract: A key challenge in applying reinforcement learning (RL) to diffusion large language models (dLLMs) is the intractability of their likelihood functions, which are essential for the RL objective, necessitating corresponding approximation during training. While existing methods approximate the log-likelihoods by their evidence lower bounds (ELBOs) via customized Monte Carlo (MC) sampling, they incur significant memory overhead due to the need...

arXiv CS 9d ago