Stackelberg Game Perspective

Related Articles from SNS

Reward Shaping for (Inference-Time) Alignment: A Stackelberg Game Perspective

arXiv:2602.02572v2 Announce Type: replace Abstract: Existing alignment methods directly use the reward model learned from user preference data to optimize an LLM policy, subject to KL regularization with respect to the base policy. This practice is suboptimal for maximizing the user's utility, because the KL regularization may cause the LLM to inherit biases in the base policy that conflict with user preferences. While amplifying rewards for preferred outputs can mitigate this bias, it also...

arXiv CS 1d ago