Kullback-Leibler


Related Articles from SNS

A Note on the Kullback-Leibler Divergence in Discretized Empirical Distributions

Announce Type: new Abstract: When empirical objects are represented as discrete probability distributions, within-distribution summaries such as Shannon entropy and Hill-type diversity indices describe how probability mass is spread inside each object, while Kullback-Leibler (KL) divergence provides pairwise asymmetric information. This note focuses on the KL difference $\Delta_{\mathrm{KL}}(p,q)=D_{\mathrm{KL}}(p\|q)-D_{\mathrm{KL}}(q\|p)$. Although $\Delta_{\mathrm{KL}}$ can add information beyond...

arXiv CS 6d ago
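
The KL difference named in the abstract above can be illustrated with a minimal sketch. It assumes two discrete distributions given as equal-length lists of probabilities over the same bins, natural logarithms (nats), and the usual conventions for zero-probability bins; the function names `kl_divergence` and `kl_difference` and the example values are hypothetical and not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for discrete distributions over the same bins."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a bin with p = 0 contributes nothing (0 * log 0 := 0)
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def kl_difference(p, q):
    """Delta_KL(p, q) = D_KL(p || q) - D_KL(q || p); antisymmetric in (p, q)."""
    return kl_divergence(p, q) - kl_divergence(q, p)


# Illustrative values only.
p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q), kl_divergence(q, p), kl_difference(p, q))
```

Swapping the arguments flips the sign of `kl_difference`. If each distribution has mass in a bin where the other has none, both divergences are infinite and the difference evaluates to `nan`, so this sketch is only meaningful when at least one direction is finite.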

LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Announce Type: replace Abstract: Speculative decoding accelerates autoregressive large language model (LLM) inference by using a lightweight draft model to propose candidate tokens that are then verified in parallel by the target model. The speedup is significantly determined by the acceptance rate, yet standard training minimizes Kullback-Leibler (KL) divergence as a proxy objective. While KL divergence and acceptance rate share the same global optimum, small draft models, having limited...

arXiv CS 8d ago
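
To make the contrast between KL divergence and acceptance rate concrete, here is a minimal sketch for a single token position. It assumes the standard speculative-sampling result that a draft token is accepted with probability equal to the sum of the elementwise minimum of the target and draft distributions (one minus their total variation distance), and it assumes the forward direction KL(target || draft) as the proxy. The function names and example values are hypothetical, and this is not the loss proposed in the paper.

```python
import math


def acceptance_rate(target, draft):
    """Expected acceptance probability at one position: sum_x min(target, draft)."""
    return sum(min(t, d) for t, d in zip(target, draft))


def kl_divergence(p, q):
    """D_KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Illustrative next-token distributions over a 3-token vocabulary.
target = [0.6, 0.3, 0.1]
draft = [0.5, 0.25, 0.25]

print(acceptance_rate(target, draft))  # overlap of the two distributions
print(kl_divergence(target, draft))    # assumed proxy objective
```

Both quantities reach their optimum when the draft equals the target (acceptance rate 1, KL divergence 0), which matches the shared global optimum mentioned in the abstract. Away from that point they can rank imperfect drafts differently.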

Well-Posed KL-Regularized Control via Wasserstein and Kalman-Wasserstein KL Divergences

arXiv:2602.02250v2 Announce Type: replace-cross Abstract: Kullback-Leibler (KL) divergence regularization is widely used in reinforcement learning, but it becomes infinite under support mismatch and can degenerate in low-noise regimes. Using a unified information-geometric framework, we introduce KL analogs by replacing the Fisher-Rao geometry in the dynamical formulation of the KL with transport-based geometries, and derive closed-form expressions for common distribution families. Between...

arXiv CS 8d ago

Self-Distilled Policy Gradient

arXiv:2606.04036v1 Announce Type: new Abstract: On-policy self-distillation, where a language model conditions on privileged context to supervise its own generations, is a promising source of dense supervision for sparse-reward reinforcement learning. It can be instantiated as an auxiliary full-vocabulary student-to-teacher reverse Kullback-Leibler divergence loss.

arXiv CS 6d ago

A machine-learning-assisted progressive digit-randomness screening framework for detecting non-random patterns in raw numerical research data

Announce Type: new Abstract: Raw numerical datasets remain less systematically examined in integrity screening than images, plagiarism, or summary-statistic inconsistencies. We developed the Fabrication-risk Digit Randomness Screening model (FDRS), a statistical and machine-learning framework for detecting non-random digit-pattern irregularities in numerical research data. FDRS integrates single- and joint-decimal-digit tests, Cramer's V, entropy metrics, Kullback-Leibler divergence,...

arXiv CS 2d ago

Generalized Guarantees for Variational Inference in the Presence of Even and Elliptical Symmetry

arXiv:2511.01064v3 Announce Type: replace-cross Abstract: Variational inference (VI) approximates a target density $p$ by the best match $q$ in a family of tractable distributions. The best variational approximation is found by minimizing a divergence between distributions, $D(p||q)$, and several divergences have been proposed as objective functions for VI, with different choices leading to different approximations. We show that even when these divergences have different minimizers, the...

arXiv CS 8d ago

Diffusion Models Observe Only Gradients: A Geometric Perspective on Score Matching Errors

arXiv:2606.06179v1 Announce Type: cross Abstract: Score-based diffusion models are typically trained by minimizing the $L^2$ score matching error, and standard theoretical analyses rely on this quantity to bound the sampling discrepancy between the learned and target distributions. We show the $L^2$ score error is not the right intrinsic measure of marginal distributional quality: a learned diffusion model can incur arbitrarily large $L^2$ score error while perfectly matching the target...

arXiv CS 5d ago

Principled Uncertainty in Clinical AI: End-to-End Bayesian Modelling and Algorithmic Equity Auditing Across Multimodal Patient Data

arXiv:2606.09789v1 Announce Type: new Abstract: Clinical artificial intelligence (AI) systems routinely produce predictions without principled quantification of uncertainty, limiting their trustworthiness in high-stakes medical environments. This paper presents an integrated research programme addressing two interconnected problems: (1) the development of a fully end-to-end Bayesian uncertainty modelling framework for multimodal clinical data, and (2) the application of calibrated...

arXiv CS 1d ago

Causally Evaluating the Learnability of Formal Language Tasks

Announce Type: new Abstract: Language models, as multi-task learners, acquire a wide range of abilities during training. A fundamental question is how much task-specific data is needed to learn a given task. Answering this for natural language is difficult: tasks are hard to delineate and can confound one another.

arXiv CS 1d ago

KLIP: localized distribution shift detection via KL-divergence with diffusion priors in Inverse Problems

arXiv:2605.31596v1 Announce Type: new Abstract: Diffusion models have shown promising performance as data-driven priors for computational imaging, as well as some capacity to detect out-of-distribution (OOD) images. However, existing approaches to OOD detection often require some knowledge of the shifted distribution, fail to detect subtle or localized distribution shifts, and operate on full images, rather than the indirect measurements available in inverse problems. We propose an OOD...

arXiv CS 9d ago