Sharpness-Aware Minimization
Related Articles from SNS
Adaptive Sharpness-Aware Minimization with a Polyak-type Step size: A Theory-Grounded Scheduler
arXiv:2606.01827v1. Sharpness-Aware Minimization (SAM) has established itself as a powerful and widely adopted optimizer for training machine learning models. By explicitly minimizing the sharpness of the loss landscape, SAM often improves generalization while delivering strong empirical performance. However, SAM and its variants, like most training algorithms, are sensitive to the choice of learning rate, which is typically selected through extensive...
Stability Analysis of Sharpness-Aware Minimization
arXiv:2301.06308v2. Sharpness-aware minimization (SAM) is a training method that seeks to find flat minima in deep learning, resulting in state-of-the-art performance across various domains. Instead of minimizing the loss of the current weights, SAM minimizes the worst-case loss in its neighborhood in the parameter space. In this paper, we investigate the convergence instability of SAM near a saddle point.
Sharpness-Aware Hybrid Model Learning for Architecture-Agnostic Parameter Estimation
arXiv:2602.06837v2. Hybrid modeling, the combination of machine learning models and scientific mathematical models, enables flexible and robust data-driven prediction with partial interpretability. However, the unknown parameters of the scientific model cannot necessarily be estimated properly, since the flexibility of the machine learning model might make the scientific model part effectively ignored in prediction.
Inconsistency-Aware Minimization: Improving Generalization with Unlabeled Data
Estimating the generalization gap and developing optimization methods that improve generalization are crucial for deep learning models, for both theoretical understanding and practical applications. Leveraging unlabeled data for these purposes offers significant advantages in real-world scenarios. This paper introduces a novel generalization measure, local inconsistency, derived from an information-geometric perspective on the parameter space of neural networks.
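
Illustration: the SAM update in brief
The stability-analysis abstract above describes SAM as minimizing the worst-case loss in a neighborhood of the current weights rather than the loss at the weights themselves. The sketch below shows one common way that idea is approximated in practice: take a gradient, step "uphill" by a fixed radius along it, then descend using the gradient measured at that perturbed point. Everything here is an illustrative assumption and is not taken from any of the listed papers: the toy quadratic loss, the names `CURVATURE`, `sam_perturbation`, and `sam_step`, and the values of `lr` and `rho` are hypothetical.

```python
import numpy as np

# Hypothetical toy problem: a quadratic loss with one sharp and one flat direction.
CURVATURE = np.array([10.0, 0.1])


def loss(w):
    """Toy loss 0.5 * sum(c_i * w_i^2); stands in for a training loss."""
    return 0.5 * float(np.sum(CURVATURE * w ** 2))


def grad(w):
    """Exact gradient of the toy loss."""
    return CURVATURE * w


def sam_perturbation(w, rho, eps=1e-12):
    """First-order approximation of the worst-case perturbation in an L2 ball of radius rho."""
    g = grad(w)
    return rho * g / (np.linalg.norm(g) + eps)


def sam_step(w, lr=0.05, rho=0.05):
    """One SAM-style update: descend using the gradient taken at the perturbed weights."""
    e = sam_perturbation(w, rho)
    g_sam = grad(w + e)  # gradient at the approximate worst-case neighbor
    return w - lr * g_sam


w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(200):
    w = sam_step(w)

print("initial loss:", loss(w0))
print("final loss:  ", loss(w))
```

Each step costs two gradient evaluations instead of one, which is the usual price of this approximation. On a real model, `grad` would be replaced by backpropagation on a mini-batch, and `rho` would be tuned alongside the learning rate.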