Modeling AdaGrad
Related Articles from SNS
Modeling AdaGrad, RMSProp, and Adam with Integro-Differential Equations
Abstract: In this paper, we propose a continuous-time formulation for the AdaGrad, RMSProp, and Adam optimization algorithms by modeling them as first-order integro-differential equations. We perform numerical simulations of these equations, along with stability and convergence analyses, to demonstrate their validity as accurate approximations of the original algorithms. Our results indicate a strong agreement between the behavior of the continuous-time models and the...
Towards Simple and Provable Parameter-Free Adaptive Gradient Methods
Abstract: Optimization algorithms such as AdaGrad and Adam have significantly advanced the training of deep models by dynamically adjusting the learning rate during the optimization process. However, ad-hoc tuning of learning rates poses a challenge and leads to inefficiencies in practice. To address this issue, recent research has focused on developing "parameter-free" algorithms that operate effectively without the need for learning rate tuning.
Convergence of Steepest Descent and Adam under Non-Uniform Smoothness
arXiv:2605.30648v1. Abstract: Recent work has analyzed the convergence of first-order methods under non-uniform smoothness assumptions that better model the loss landscape in machine learning tasks. We generalize this assumption to objectives whose curvature is an affine function of the objective value. This property is satisfied by a broad class of problems, including logistic regression, generalized linear models with a logistic link function, softmax policy gradient in...