RewardDance

Related Articles from SNS

Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

arXiv:2606.09076v1. Abstract: Reward models are central to text-to-image post-training, but visual preference is subjective and better represented as a distribution over rubric scores than as a deterministic scalar. Existing scalar, score-token, and pairwise reward models over-compress uncertainty and fine-grained score differences, while reasoning-based generative rewards provide stronger judgments but are costly to deploy and difficult to use as direct optimization...

arXiv CS 1d ago