Matrix Optimization
Related Articles from SNS
DeMuon: A Decentralized Muon for Matrix Optimization over Graphs
arXiv:2510.01377v2 Announce Type: replace-cross Abstract: In this paper, we propose DeMuon, a method for decentralized matrix optimization over a given communication topology. DeMuon incorporates matrix orthogonalization via Newton-Schulz iterations (a technique inherited from its centralized predecessor, Muon) and employs gradient tracking to mitigate heterogeneity among local functions. Under heavy-tailed noise conditions and additional mild assumptions, we establish the iteration complexity...
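The Newton-Schulz orthogonalization named in the abstract above can be illustrated with the classic cubic iteration X <- 1.5 X - 0.5 X X^T X. The following is a minimal pure-Python sketch under stated assumptions, not DeMuon's or Muon's actual implementation (Muon implementations typically use a tuned higher-order polynomial); the function names and the step count are illustrative choices.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(g, steps=20):
    """Approximate the orthogonal polar factor of a full-rank matrix g.

    Assumption: classic cubic Newton-Schulz iteration, chosen for clarity.
    """
    # Dividing by the Frobenius norm puts every singular value in (0, 1],
    # which lies inside the convergence region of the iteration.
    fro = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3, which tends to 1.
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(row_x, row_c)]
             for row_x, row_c in zip(x, xxtx)]
    return x


if __name__ == "__main__":
    q = newton_schulz_orthogonalize([[2.0, 0.0], [1.0, 1.0]])
    print(matmul(transpose(q), q))  # close to the 2x2 identity
```

The iteration uses only matrix multiplications and additions, which is the usual reason it is preferred over an explicit SVD in this setting.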
When Good Enough Is Optimal: Multiplication-Only Matrix Inversion Approximation for Quantized Gated DeltaNet
Announce Type: new Abstract: Matrix inversion in chunk-wise parallel linear attention is a major bottleneck for long-context modeling, particularly on NPUs, where forward-substitution-based methods exhibit limited parallelism and poor hardware utilization. We propose a fast, Matrix Multiplication (MatMul)-based algorithm tailored for strictly lower-triangular matrices arising in chunk-wise linear attention. Motivated by the rapid growth of Neumann-series terms and the diagonal concentration...
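As background for the abstract above: for a strictly lower-triangular matrix L of size n, L^n = 0, so the Neumann series for (I - L)^{-1} terminates and can be evaluated with matrix multiplications alone. The sketch below shows this textbook identity using the doubling product (I + L)(I + L^2)(I + L^4)...; it is not the paper's algorithm, which the truncated abstract describes only as an approximation, and the form I - L and all function names are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(a, b):
    """Add two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neumann_inverse(l):
    """Return (I - L)^{-1} for a strictly lower-triangular square matrix L.

    Exact because L is nilpotent: (I - L)^{-1} = I + L + ... + L^(n-1).
    """
    n = len(l)
    inv = add(identity(n), l)  # covers the terms L^0 and L^1
    power = l
    terms = 2                  # number of leading Neumann terms in inv
    while terms < n:
        power = matmul(power, power)                # L^(2^k)
        inv = matmul(inv, add(identity(n), power))  # doubles the covered terms
        terms *= 2
    return inv


if __name__ == "__main__":
    L = [[0, 0, 0], [2, 0, 0], [1, 3, 0]]
    print(neumann_inverse(L))
```

The doubling form needs about log2(n) rounds of multiplications rather than n sequential substitution steps, which is why multiplication-only formulations are attractive on hardware built around matrix units.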
Optimal Control and Dissipativity of Linear Hermitian Matrix-Valued Dynamical Systems
arXiv:2606.08856v1 Announce Type: cross Abstract: We develop a unified framework for linear-cost optimal control, finite-time optimal steering, dissipativity analysis, and zero-sum differential games for linear impulsive systems whose state is a Hermitian matrix evolving in $\mathbb{H}^{n+m}_{\succeq0}$, a class that encompasses continuous- and discrete-time linear systems and switched systems as degenerate cases, and includes the second-order moment dynamics of linear (stochastic) hybrid...
POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation
arXiv:2603.05500v2 Announce Type: replace Abstract: Efficient and stable training of large language models (LLMs) remains a core challenge in modern machine learning systems. To address this challenge, Reparameterized Orthogonal Equivalence Training (POET), a spectrum-preserving framework that optimizes each weight matrix through orthogonal equivalence transformation, has been proposed. Although POET provides strong training stability, its original implementation incurs high memory...
Convergence Bound and Critical Batch Size of Muon Optimizer
arXiv:2507.01598v5 Announce Type: replace Abstract: Muon, a recently proposed optimizer that leverages the inherent matrix structure of neural network parameters, has demonstrated strong empirical performance, indicating its potential as a successor to standard optimizers such as AdamW. This paper presents theoretical analysis to support its practical success. We provide convergence proofs for Muon across four practical settings, systematically examining its behavior with and without the...
A Note on Stability for Orthogonalized Matrix Momentum with Client Sampling
Announce Type: new Abstract: We study finite-sample generalization for a client-sampled distributed optimization scheme with matrix-valued parameters and orthogonalized momentum updates. The central quantity is the gap between the population and empirical objectives at the returned model when only a subset of clients participates in each round. Under independent heterogeneous client data, unequal local sample counts, and fixed aggregation weights, we derive a finite-round upper-tail...
Latent Structural Categorical Matrix Completion with Application to Quasispecies Analysis
Announce Type: cross Abstract: Matrix completion has been extensively studied for real-valued data, but existing methods are often limited in handling categorical variables. We propose LCMC, a double-loop optimization framework for categorical matrix completion via latent factorization based on a binary tensor representation. In this setting, each categorical entry is encoded as a one-hot vector along a third tensor mode, thereby preserving its discrete, non-ordinal nature.
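The binary tensor representation described in the abstract above can be sketched as follows: each categorical entry becomes a one-hot vector along a third axis. This is a minimal illustration, not the LCMC code; the nucleotide alphabet, the use of `None` for unobserved entries, and the choice to encode them as all-zero vectors are assumptions made for the example.

```python
def one_hot_tensor(matrix, categories):
    """Encode a categorical matrix as a rows x cols x len(categories) binary tensor.

    Assumption: unobserved entries are given as None and map to all zeros.
    """
    return [
        [[1 if entry == c else 0 for c in categories] for entry in row]
        for row in matrix
    ]


if __name__ == "__main__":
    reads = [["A", "G"], [None, "C"]]  # hypothetical data with one missing entry
    print(one_hot_tensor(reads, ["A", "C", "G", "T"]))
```

Because every category gets its own slot, the encoding imposes no ordering between categories, which matches the non-ordinal treatment the abstract describes.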
LiMuon: Light and Fast Muon Optimizer for Large Models
arXiv:2509.14562v4 Announce Type: replace Abstract: Large models recently are widely applied in machine learning, so efficient training of large models has received widespread attention. More recently, the useful Muon optimizer is specifically designed for matrix-structured parameters of large models. Although some works have begun to study the Muon optimizer, the existing Muon and its variants still suffer from high sample complexity or high memory for large models.
FalconGEMM: Surpassing Hardware Peaks with Lower-Complexity Matrix Multiplication
arXiv:2605.06057v3 Announce Type: replace Abstract: Peak breaking Matrix Multiplication is a promising technique to improve the performance of DL, especially in LLM training and inference. We present FalconGEMM, a cross-platform framework that automates the deployment, optimization, and selection of Lower-Complexity Matrix Multiplication Algorithms (LCMAs) across diverse hardware. There are three key innovations: (1) a Deployment Module that enables portable execution across various hardware...