the Edge of Stability
Related Articles from SNS
Edge of Stability Selectively Shapes Learning Across the Data Distribution
Abstract: Existing analyses of the edge of stability (EoS) treat it as a global property of optimization. We show that it is also selective: the stability constraint redistributes learning across subsets of the training distribution, amplifying progress on some groups while suppressing progress on others. Using a branching intervention that enters or exits the EoS regime from the same training state, we causally demonstrate this trade-off and identify two necessary conditions for a group...
Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network
Abstract: We study the dynamics of gradient descent in the Edge of Stability regime, where the learning rate is large enough to induce persistent oscillations in the loss and the sharpness. We propose a continuous-time effective model that tracks the evolution of the average trajectory coupled with the time-averaged covariance of its fast oscillations. Our analysis reveals that the natural quantity to monitor in such unstable regimes is an effective free energy, which...
Conflicting Biases at the Edge of Stability: Norm versus Sharpness Regularization
arXiv:2505.21423v3. Abstract: The remarkable generalization properties of overparameterized networks are often attributed to implicit biases, such as norm minimization at small learning rates and low sharpness in the Edge-of-Stability regime. In this work, we argue that a comprehensive understanding of the generalization performance of gradient descent requires analyzing the interaction between these various forms of implicit regularization. We empirically demonstrate...
Bi-S network origin of cation-disorder stability and dispersive band edges in AgBiS2
Abstract: Cation-disordered AgBiS2 is a promising lead-free optoelectronic material, but both its ordered structure and the microscopic origin of its favorable electronic properties remain debated. Theory has proposed a mixed-coordination tendency with tetrahedral AgS4 and octahedral BiS6 units, whereas experiments mainly report octahedrally coordinated ordered and cation-disordered phases, together with local cation off-centering. Here, we combine a machine-learning...
Impurity-driven turbulence opens a pathway to ELM-free operation and enhanced pedestal stability in tokamaks
Abstract: Edge-localized modes (ELMs) impose severe transient heat and particle loads on plasma-facing components, posing a critical challenge for steady-state operation of tokamak fusion reactors. Existing ELM control techniques either rely on externally applied perturbations or operate within narrow parameter windows, raising concerns for reactor scalability. Here we demonstrate that controlled injection of a low-Z impurity can fundamentally modify pedestal transport...
High-Dimensional Latents Should Be Diagnosed Through Phase Structure
arXiv:2606.02600v1. Abstract: We study autoencoder and variational-autoencoder latent spaces through the lens of spin-glass theory. The paper has two components. First, we formalize a latent-space spin-glass dictionary: for a fixed decoder, the reconstruction term together with a hyperspherical coordinates prior induces a Hamiltonian on the latent sphere, where latent coordinates play the role of continuous spins and the prior acts as an external magnetic field.
Ranking college football's top 100 newcomers for t...
If the upcoming 2026 college football season is anything like its predecessor, transfer quarterbacks and top freshmen will be crucial for many College Football Playoff runs. And by now, with less than 100 days until the start of the season, we can assess rosters and what players did during spring practice with their new teams. While we have analyzed the top newcomer for each Power 4 team, these rankings are regardless of teams.
Show HN: FFmpeg WebCLI – Full FFmpeg in Browser, Offline PWA, No Uploads(WASM)
A browser-based video editor powered by ffmpeg.wasm. No uploads, no servers -- all processing happens locally in your browser using WebAssembly. Live app: https://tejaswigowda.com/ffmpeg-webCLI/
Flatland: The Adventures of Gradient Descent with Large Step Sizes
Abstract: The training of neural networks often entails objective functions that are not globally $L$-smooth. For these functions, it is both theoretically and practically difficult to answer the question: what is the largest possible step size that ensures the convergence of gradient descent (GD)? We address this longstanding open question in deep learning by providing a unifying definition of "large" step sizes that requires only local Lipschitz (or even Hölder)...
Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway
Abstract: Recent analyses of multi-pathway Deep Linear Networks use Gradient Flow to predict a "winner-takes-all" specialization in which path symmetry breaks and each feature concentrates in a single pathway. In this work, we show that discrete Gradient Descent (GD) with a large step size tells a different story. We prove that single-path solutions are sharp minima, whereas distributing signals across pathways reduces sharpness by a factor that decreases with both the number of pathways...
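Several of the abstracts above concern gradient descent with step sizes near the stability threshold. As a minimal illustration, not taken from any of the listed papers, the sketch below runs gradient descent on the one-dimensional quadratic f(x) = 0.5 * sharpness * x^2. Each update multiplies x by (1 - lr * sharpness), so the iterates contract only when lr < 2 / sharpness. The function name `run_gd` and the chosen values are assumptions made for illustration.

```python
def run_gd(sharpness, lr, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2.

    Returns the list of |x| values, one per iterate (including the start).
    """
    x = x0
    trajectory = [abs(x)]
    for _ in range(steps):
        grad = sharpness * x      # f'(x) for the quadratic
        x = x - lr * grad         # plain gradient descent step
        trajectory.append(abs(x))
    return trajectory


sharpness = 4.0                   # curvature (second derivative) of the toy loss
threshold = 2.0 / sharpness       # classical stability limit for a quadratic

# Just below the threshold the iterates oscillate in sign but shrink.
stable = run_gd(sharpness, lr=0.9 * threshold)

# Just above the threshold the iterates oscillate and grow.
unstable = run_gd(sharpness, lr=1.1 * threshold)

print("below threshold, final |x|:", stable[-1])
print("above threshold, final |x|:", unstable[-1])
```

On a pure quadratic the curvature is fixed, so a step size above the threshold simply diverges. The edge-of-stability behavior discussed in the abstracts arises in non-quadratic losses, where the sharpness itself changes during training; this toy example only shows where the threshold comes from.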