KL

Related Articles from SNS

Well-Posed KL-Regularized Control via Wasserstein and Kalman-Wasserstein KL Divergences

arXiv:2602.02250v2 Announce Type: replace-cross Abstract: Kullback-Leibler (KL) divergence regularization is widely used in reinforcement learning, but it becomes infinite under support mismatch and can degenerate in low-noise regimes. Using a unified information-geometric framework, we introduce KL analogs by replacing the Fisher-Rao geometry in the dynamical formulation of the KL with transport-based geometries, and derive closed-form expressions for common distribution families. Between...

arXiv CS 8d ago

Online KL-Regularized Reinforcement Learning with Function Approximation under Misspecification

arXiv:2606.06053v1 Announce Type: new Abstract: We study KL-regularized contextual bandits and episodic reinforcement learning (RL) under general function approximation with model misspecification. Existing guarantees rely on realizability and therefore do not extend to misspecified models, where classical regret bounds may fail. This work introduces KL misspecification formulations for contextual bandits and episodic RL and analyzes regression-based algorithms with Gibbs policy updates.

arXiv CS 5d ago

Escaping the KL Agreement Trap in On-Policy Distillation

arXiv:2606.09471v1 Announce Type: new Abstract: On-policy distillation (OPD) provides dense token-level supervision by asking a teacher to score student-generated rollouts. However, when the student drifts into an unrecoverable prefix, the teacher may locally agree with the degraded state, producing low reverse KL but little corrective training signal. We identify this persistent regime as a low-KL agreement trap.

arXiv CS 1d ago

IND vs AFG: KL Rahul sheds T20 edginess, Sudharsan makes the most of 'long rope'

TimesofIndia.com in Mullanpur: The challenge before KL Rahul and Sai Sudharsan was not Afghanistan's bowling attack. It was the transition from two months of relentless T20 cricket back to the demands of Test cricket. One had to rediscover patience after a prolific IPL campaign, while the other had to justify the faith placed in him by the team management.

Times of India 3d ago

KL Rahul goes unsold in Maharaja Trophy auction: Here's why

India Test vice-captain KL Rahul went unsold in the Maharaja Trophy KSCA T20 2026 auction on Friday despite being one of the biggest names available. Rahul, who scored 593 runs in 14 matches for Delhi Capitals during IPL 2026, was the second player to come up in the auction. However, none of the six franchises placed a bid for him in the opening round because his availability for the tournament remains uncertain.

Times of India 5d ago

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Announce Type: new Abstract: Unlearning in diffusion models aims to remove undesirable data or concepts while preserving the utility of pretrained models -- two fundamentally conflicting objectives. We propose a principled constrained optimization framework that formulates unlearning as minimizing the deviation from a pretrained model, subject to explicit separation constraints from the unlearning distributions. Specifically, we formulate three constrained optimization problems based on...

arXiv CS 9d ago

KLIP: localized distribution shift detection via KL-divergence with diffusion priors in Inverse Problems

arXiv:2605.31596v1 Announce Type: new Abstract: Diffusion models have shown promising performance as data-driven priors for computational imaging, as well as some capacity to detect out-of-distribution (OOD) images. However, existing approaches to OOD detection often require some knowledge of the shifted distribution, fail to detect subtle or localized distribution shifts, and operate on full images, rather than the indirect measurements available in inverse problems. We propose an OOD...

arXiv CS 9d ago

A Note on the Kullback-Leibler Divergence in Discretized Empirical Distributions

Announce Type: new Abstract: When empirical objects are represented as discrete probability distributions, within-distribution summaries such as Shannon entropy and Hill-type diversity indices describe how probability mass is spread inside each object, while Kullback-Leibler (KL) divergence provides pairwise asymmetric information. This note focuses on the KL difference $\Delta_{\mathrm{KL}}(p,q)=D_{\mathrm{KL}}(p\|q)-D_{\mathrm{KL}}(q\|p)$. Although $\Delta_{\mathrm{KL}}$ can add information beyond...

arXiv CS 6d ago
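
The KL difference quoted in the abstract above can be illustrated with a minimal sketch for discrete distributions over a shared set of bins. The function names `kl_divergence` and `kl_difference` are hypothetical and not taken from the paper; natural logarithms (nats) and the zero-handling conventions below are assumptions.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions on the same bins.

    Assumed conventions: terms with p_i == 0 contribute 0, and the result
    is +inf when p puts mass on a bin where q has none (support mismatch).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def kl_difference(p, q):
    """Delta_KL(p, q) = D_KL(p || q) - D_KL(q || p).

    The value is NaN when both directions are infinite.
    """
    return kl_divergence(p, q) - kl_divergence(q, p)


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    p = [0.7, 0.2, 0.1]
    q = [0.4, 0.4, 0.2]
    print(kl_difference(p, q))
    print(kl_difference(q, p))  # same magnitude, opposite sign
```

By construction the quantity is antisymmetric, so swapping the arguments flips the sign, and it is zero whenever the two directions of the divergence agree (for example when `p == q`).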

LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Announce Type: replace Abstract: Speculative decoding accelerates autoregressive large language model (LLM) inference by using a lightweight draft model to propose candidate tokens that are then verified in parallel by the target model. The speedup is significantly determined by the acceptance rate, yet standard training minimizes Kullback-Leibler (KL) divergence as a proxy objective. While KL divergence and acceptance rate share the same global optimum, small draft models, having limited...

arXiv CS 8d ago
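
As a rough illustration of the acceptance rate mentioned in the abstract above: assuming the standard speculative sampling accept/reject rule (the abstract does not spell out the rule it uses), the probability that a single drafted token is accepted is the overlap between the target distribution `p` and the draft distribution `q`, which equals one minus their total variation distance. The function names below are hypothetical and the numbers are illustrative only.

```python
def acceptance_rate(p, q):
    """Per-token acceptance probability, sum_x min(p(x), q(x)).

    Assumes the standard speculative sampling rule: a token x drawn from
    the draft q is accepted with probability min(1, p(x) / q(x)).
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same vocabulary")
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def total_variation(p, q):
    """Total variation distance, 0.5 * sum_x |p(x) - q(x)|."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    target = [0.6, 0.3, 0.1]  # hypothetical target-model token distribution
    draft = [0.5, 0.2, 0.3]   # hypothetical draft-model token distribution
    print(acceptance_rate(target, draft))
    print(1.0 - total_variation(target, draft))  # same value up to rounding
```

The overlap reaches 1 only when the draft matches the target exactly, which is also the point where the KL divergence is zero; away from that point the two quantities can rank imperfect drafts differently.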

Reward Shaping for (Inference-Time) Alignment: A Stackelberg Game Perspective

arXiv:2602.02572v2 Announce Type: replace Abstract: Existing alignment methods directly use the reward model learned from user preference data to optimize an LLM policy, subject to KL regularization with respect to the base policy. This practice is suboptimal for maximizing user's utility because the KL regularization may cause the LLM to inherit the bias in the base policy that conflicts with user preferences. While amplifying rewards for preferred outputs can mitigate this bias, it also...

arXiv CS 1d ago