Adaptive Momentum
Related Articles from SNS
DP-MacAdam: Differentially Private Mechanism with Adaptive Clipping and Adaptive Momentum
Abstract: Differentially private stochastic gradient descent (DP-SGD) has become the standard framework for privacy-preserving machine learning, yet its reliance on a fixed gradient clipping threshold to limit sensitivity remains a significant practical limitation. Adaptive clipping algorithms such as AdaClip shift and scale the gradient prior to clipping and adding noise so that the clipped gradient yields a more informative descent direction. The shift and scaling parameters are...
The Streets of Rage movie has new writers and a director
Pat Casey and Josh Miller, the pair behind the Sonic the Hedgehog trilogy, are penning the script. It's been years since we've had an update on that upcoming Streets of Rage movie, but we finally have some forward momentum. The adaptation now has a director and new writers, according to Variety.
OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality
arXiv:2606.08783v1 Announce Type: cross Abstract: Orthogonalized momentum updates, as used in Muon-style optimizers, have recently shown strong empirical stability in large-scale deep learning. However, existing orthogonalized methods are typically paired with constant or open-loop magnitude rules, and therefore do not explicitly calibrate their update magnitudes from the observed optimization trajectory.
Consecutive Support Matching Induced Parameter Tuning Accelerates Momentum Iterative Hard Thresholding
Announce Type: new Abstract: Momentum-based acceleration of iterative hard thresholding (IHT) can dramatically speed up sparse signal recovery from linear measurements, but its effectiveness hinges on careful parameter tuning -- a task complicated by the frequent support changes inherent to hard thresholding. We propose CosMIHT (Consecutive Support Matching Induced Momentum IHT), which resolves this difficulty through a simple adaptive rule: start with the conservative parameters and whenever...
Preserving Full 6-DOF Actuation Under Abrupt Total Rotor Failures: Passive Fault-Tolerant Flight Control Using a Biaxial-Tilt Hexacopter
arXiv:2606.05663v1 Announce Type: new Abstract: Conventional multirotors suffer from a rapid collapse of attainable wrench space (AWS) under abrupt total rotor failures, rendering full 6-DOF recovery physically impossible. This paper addresses passive fault-tolerant flight of a biaxial-tilt overactuated hexacopter (BTO) under abrupt total rotor failures that are a priori unknown to the controller. The control design and analysis focus on representative abrupt rotor-failure cases for which...
Test-Time Training for Zero-Resource Dense Retrieval Reranking
Announce Type: new Abstract: Dense retrievers excel at first-stage candidate generation but lack effective reranking in zero-resource settings. Existing approaches face a fundamental dilemma: cross-encoders deliver strong reranking quality but require costly supervised training and incur high latency, while unsupervised BM25 reranking consistently degrades dense retrieval performance on most of BEIR benchmarks. We propose DART (Dense Adaptive Reranking at Test-time), which resolves this...
ZTE Day Indonesia 2026 strengthens AI innovation and digital infrastructure collaboration to accelerate Indonesia's digital transformation
ZTE held ZTE Day Indonesia 2026 in Jakarta to showcase its latest integrated ICT innovations aimed at accelerating Indonesia's digital transformation. The event highlighted how AI, intelligent networks, and cloud infrastructure are crucial for national digital competitiveness and future economic growth. ZTE emphasised the need for strong cross-industry collaboration to build a robust digital foundation for the country.
Stochastic convergence of parallel asynchronous adaptive first-order methods
arXiv:2606.01787v1 Announce Type: new Abstract: A new class of asynchronous adaptive first-order optimization methods is introduced, comprising asynchronous variants of several popular algorithms. Versions of these methods using momentum and/or inexact normalization are also considered. The convergence of methods in the class on non-convex functions is analyzed in a fully stochastic setting, and is shown to be (up to logarithmic factors) of order $O(1/\sqrt{t})$ under reasonable assumptions.
OECD cuts 2026 global growth forecast and warns of recession risk if Iran war persists
The war in the Middle East has dented economic growth prospects worldwide, with a more severe shock likely if no effective ceasefire is agreed before 2027, the OECD warned Wednesday. The OECD has downgraded its global growth outlook, warning that rising energy prices, geopolitical tensions and persistent inflation are weighing on the world economy and could push several countries into recession if disruptions continue. In its quarterly update, the organisation, which represents 38...
Muon$^2$: Boosting Muon via Adaptive Second-Moment Preconditioning
arXiv:2604.09967v2 Announce Type: replace Abstract: Muon has emerged as a promising optimizer for large-scale foundation model pre-training by exploiting the matrix structure of neural network updates through iterative orthogonalization. However, the orthogonalization quality of Muon hinges on the number of Newton--Schulz (NS) iterations performed, which poses efficiency challenges due to its non-trivial computation and communication cost. We propose Muon$^2$, an extension of Muon, to...