Schulz
Related Articles from SNS
A New Method for Finding the Schulze Winner Set
arXiv:2606.02213v1 Announce Type: cross Abstract: We propose a new voting algorithm based on the pairwise majority-comparison matrix derived from voters' preference profiles. We show that this algorithm induces exactly the winner set of the Schulze rule (Schulze, 1997). Our algorithm successively eliminates weaker candidates in terms of all-pairs comparisons, thereby reflecting a dual spirit to Condorcet's original idea of splitting preference cycles (de Condorcet, 1785).
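For orientation, the winner set that the abstract refers to can be computed from the pairwise matrix with the classical strongest-path (widest-path) formulation of the Schulze rule. The sketch below is that textbook formulation, not the elimination algorithm proposed in the paper, which the abstract does not spell out. The function name `schulze_winners` and the input layout (`d[i][j]` = number of voters preferring candidate i to candidate j) are assumptions made for illustration.

```python
def schulze_winners(d):
    """Return indices of Schulze winners given pairwise counts d[i][j].

    Assumption: d[i][j] is the number of voters who rank candidate i above j.
    """
    n = len(d)
    # p[i][j] starts as the direct winning margin strength of i over j,
    # or 0 when i does not beat j head to head.
    p = [[d[i][j] if i != j and d[i][j] > d[j][i] else 0 for j in range(n)]
         for i in range(n)]
    # Floyd-Warshall variant: the strength of a path is its weakest link,
    # and we keep the strongest such path between every ordered pair.
    for k in range(n):
        for i in range(n):
            if i == k:
                continue
            for j in range(n):
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    # A candidate wins if no other candidate has a strictly stronger path to it.
    return [i for i in range(n)
            if all(p[i][j] >= p[j][i] for j in range(n) if j != i)]


# Nine voters, three candidates, with a majority cycle 0 > 1 > 2 > 0.
pairwise = [[0, 6, 4],
            [3, 0, 7],
            [5, 2, 0]]
print(schulze_winners(pairwise))  # → [0]
```

In this example the weakest link of the cycle is candidate 2's 5-4 win over candidate 0, so candidate 0 ends up with the strongest paths against both rivals.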
DeMuon: A Decentralized Muon for Matrix Optimization over Graphs
arXiv:2510.01377v2 Announce Type: replace-cross Abstract: In this paper, we propose DeMuon, a method for decentralized matrix optimization over a given communication topology. DeMuon incorporates matrix orthogonalization via Newton-Schulz iterations, a technique inherited from its centralized predecessor, Muon, and employs gradient tracking to mitigate heterogeneity among local functions. Under heavy-tailed noise conditions and additional mild assumptions, we establish the iteration complexity...
Muon$^2$: Boosting Muon via Adaptive Second-Moment Preconditioning
arXiv:2604.09967v2 Announce Type: replace Abstract: Muon has emerged as a promising optimizer for large-scale foundation model pre-training by exploiting the matrix structure of neural network updates through iterative orthogonalization. However, the orthogonalization quality of Muon hinges on the number of Newton-Schulz (NS) iterations performed, which poses efficiency challenges due to its non-trivial computation and communication cost. We propose Muon$^2$, an extension of Muon, to...
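To make the dependence on the iteration count concrete, here is a minimal sketch of Newton-Schulz orthogonalization using the classical cubic update X <- 1.5 X - 0.5 X X^T X. This is an assumption made for illustration: Muon implementations generally use a tuned polynomial rather than this textbook cubic, and the abstract does not give coefficients. The helper names (`matmul`, `transpose`, `newton_schulz`) are hypothetical, and plain Python lists are used so the sketch has no dependencies.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz(g, steps=5):
    """Approximate the orthogonal polar factor of g with `steps` cubic updates."""
    # Scale by the Frobenius norm so every singular value lies in (0, 1],
    # which is inside the convergence region of the cubic iteration.
    norm = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        # Each step costs two matrix products: X X^T, then (X X^T) X.
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(row_x, row_c)]
             for row_x, row_c in zip(x, xxt_x)]
    return x


g = [[2.0, 0.0],
     [1.0, 1.0]]
for steps in (1, 5, 30):
    q = newton_schulz(g, steps=steps)
    # Q Q^T approaches the identity as the number of steps grows.
    print(steps, matmul(q, transpose(q)))
```

Every extra step adds two matrix products, which is the computation cost the abstract refers to; in a distributed setting those products may also require communication.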
Spectral Scaling Laws of Muon
arXiv:2606.04058v2 Announce Type: replace Abstract: Orthonormalized update rules have rapidly become a leading choice of optimizer for training large language models, with recent open-source state-of-the-art models adopting Muon. To keep these updates tractable, Muon performs the orthonormalization with the Newton-Schulz (NS) iteration. Since NS is only approximate, directions with small singular values fail to be orthonormalized.
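The last sentence of the abstract can be illustrated on singular values directly. For X = U S V^T, the product X X^T X equals U S^3 V^T, so a polynomial NS update acts on each singular value independently. The sketch below assumes the classical cubic map s -> 1.5 s - 0.5 s^3 (Muon's actual polynomial may differ), and the function name `ns_singular_value` is hypothetical.

```python
def ns_singular_value(s, steps):
    """Apply the cubic Newton-Schulz map to a single singular value."""
    for _ in range(steps):
        s = 1.5 * s - 0.5 * s ** 3
    return s


# After 5 steps, a large singular value is essentially 1, while a tiny one
# has only grown by roughly 1.5x per step and is still far from 1.
for s0 in (0.9, 0.1, 0.001):
    print(s0, ns_singular_value(s0, steps=5))
```

Near zero the map is approximately s -> 1.5 s, so with a fixed step budget the directions that start with very small singular values are left almost unorthonormalized.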
LiMuon: Light and Fast Muon Optimizer for Large Models
arXiv:2509.14562v4 Announce Type: replace Abstract: Large models have recently been widely applied in machine learning, so their efficient training has received widespread attention. More recently, the Muon optimizer was designed specifically for the matrix-structured parameters of large models. Although some works have begun to study the Muon optimizer, existing Muon variants still suffer from high sample complexity or high memory cost for large models.
A Note on Stability for Orthogonalized Matrix Momentum with Client Sampling
Announce Type: new Abstract: We study finite-sample generalization for a client-sampled distributed optimization scheme with matrix-valued parameters and orthogonalized momentum updates. The central quantity is the gap between the population and empirical objectives at the returned model when only a subset of clients participates in each round. Under independent heterogeneous client data, unequal local sample counts, and fixed aggregation weights, we derive a finite-round upper-tail...