Generative Drifting is Secretly Score Matching: a Spectral and Variational Perspective

arXiv CS, Monday 01 June 2026, 04:00 UTC. By Erkan Turan, Nicolas Dufour, Maks Ovsjanikov.

arXiv:2603.09936v2

Abstract: Generative Modeling via Drifting \citep{deng2026drifting} has recently achieved state-of-the-art one-step image generation through a kernel-based drift operator, yet its success is largely empirical and its theoretical foundations remain poorly understood. We observe that under a Gaussian kernel, the drift operator is exactly a score difference on smoothed distributions. This answers three questions left open in the original work: (1) whether a vanishing drift guarantees equality of distributions ($V_{p,q}=0\Rightarrow p=q$), (2) how to choose between kernels, and (3) why the stop-gradient operator is indispensable for stable training. Our observations position drifting within the score-matching family. By linearizing the McKean-Vlasov dynamics and probing them in Fourier space, we reveal frequency-dependent convergence timescales comparable to Landau damping in plasma kinetic theory: the Gaussian kernel suffers an exponential high-frequency bottleneck, potentially explaining the empirical preference for the Laplacian kernel. This suggests a fix: an exponential bandwidth annealing schedule $\sigma(t)=\sigma_0 e^{-rt}$ that reduces convergence time from $\exp(O(K_{\max}^2))$ to $O(\log K_{\max})$. Finally, by formalizing drifting as a Wasserstein gradient flow of the smoothed KL divergence, we prove that the stop-gradient operator is not a heuristic but is derived from the frozen-field discretization mandated by the Jordan-Kinderlehrer-Otto (JKO) scheme, and removing it severs training from any gradient-flow guarantee. This variational perspective further provides a general template for constructing novel drift operators, which we demonstrate with a Sinkhorn divergence drift. We validate our analysis on toy datasets and scale it up to ImageNet.
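The abstract's central claim, that a Gaussian-kernel drift is a score difference on smoothed distributions, can be checked numerically on toy data. The sketch below is illustrative only and is not the authors' code. It assumes the drift is the difference of two Gaussian-kernel mean-shift vectors (one toward data samples, one toward generated samples); the abstract does not spell out the operator, so that form is an assumption. The function names `annealed_sigma`, `mean_shift`, `log_smoothed_density`, and `smoothed_score` are hypothetical, as are the sample sizes and the values of `sigma0` and `r`. The only formula taken from the abstract is the schedule $\sigma(t)=\sigma_0 e^{-rt}$.

```python
import numpy as np

def annealed_sigma(t, sigma0, r):
    """Exponential bandwidth schedule sigma(t) = sigma0 * exp(-r * t)."""
    return sigma0 * np.exp(-r * t)

def mean_shift(x, samples, sigma):
    """Gaussian-kernel weighted average of (y - x) over the samples y.

    ASSUMPTION: this normalized mean-shift form stands in for one half of
    the drift operator; the abstract does not give the exact definition.
    """
    diffs = samples - x                                   # shape (n, d)
    logw = -np.sum(diffs ** 2, axis=1) / (2.0 * sigma ** 2)
    w = np.exp(logw - logw.max())                         # stable weights
    w /= w.sum()
    return w @ diffs                                      # shape (d,)

def log_smoothed_density(x, samples, sigma):
    """Log density at x of the empirical distribution convolved with N(0, sigma^2 I)."""
    d = samples.shape[1]
    logk = -np.sum((samples - x) ** 2, axis=1) / (2.0 * sigma ** 2)
    m = logk.max()
    log_norm = 0.5 * d * np.log(2.0 * np.pi * sigma ** 2)
    return m + np.log(np.mean(np.exp(logk - m))) - log_norm

def smoothed_score(x, samples, sigma, eps=1e-5):
    """Central finite-difference gradient of log_smoothed_density at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        hi = log_smoothed_density(x + step, samples, sigma)
        lo = log_smoothed_density(x - step, samples, sigma)
        grad[i] = (hi - lo) / (2.0 * eps)
    return grad

# Toy 2-D data: "p" plays the data distribution, "q" the generated one.
rng = np.random.default_rng(0)
p_samples = rng.normal(loc=1.0, scale=1.0, size=(200, 2))
q_samples = rng.normal(loc=-1.0, scale=0.5, size=(200, 2))
x = np.array([0.3, -0.2])

# Bandwidth taken from the annealing schedule (sigma0 and r are arbitrary here).
sigma = annealed_sigma(t=1.0, sigma0=1.0, r=0.5)

# Assumed drift: attraction toward p minus attraction toward q.
drift = mean_shift(x, p_samples, sigma) - mean_shift(x, q_samples, sigma)

# sigma^2 times the difference of smoothed scores.
score_diff = sigma ** 2 * (
    smoothed_score(x, p_samples, sigma) - smoothed_score(x, q_samples, sigma)
)

print("max abs difference:", np.max(np.abs(drift - score_diff)))
```

Under these assumptions the two vectors agree up to finite-difference error, because the gradient of the log of a Gaussian-smoothed empirical density is the kernel-weighted mean shift divided by $\sigma^2$. The factor $\sigma^2$ in this sketch is a consequence of the chosen normalization and may differ from the paper's convention.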

Originally published by arXiv CS.
