Home Knowledge Base Sharpness-Aware Minimization

Sharpness-Aware Minimization

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Adaptive Sharpness-Aware Minimization with a Polyak-type Step size: A Theory-Grounded Scheduler

arXiv:2606.01827v1 Announce Type: cross Abstract: Sharpness-Aware Minimization (SAM) has established itself as a powerful and widely adopted optimizer for training machine learning models. By explicitly minimizing the sharpness of the loss landscape, SAM often improves generalization while delivering strong empirical performance. However, SAM and its variants, like most training algorithms, are sensitive to the choice of learning rate, which is typically selected through extensive...

arXiv CS 8d ago

Stability Analysis of Sharpness-Aware Minimization

arXiv:2301.06308v2 Announce Type: replace Abstract: Sharpness-aware minimization (SAM) is a training method that seeks to find flat minima in deep learning, resulting in state-of-the-art performance across various domains. Instead of minimizing the loss of the current weights, SAM minimizes the worst-case loss in its neighborhood in the parameter space. In this paper, we investigate the convergence instability of SAM near a saddle point.

arXiv CS 8d ago

Sharpness-Aware Hybrid Model Learning for Architecture-Agnostic Parameter Estimation

arXiv:2602.06837v2 Announce Type: replace Abstract: Hybrid modeling, the combination of machine learning models and scientific mathematical models, enables flexible and robust data-driven prediction with partial interpretability. However, the unknown parameters of the scientific model cannot necessarily be estimated properly, since the flexibility of the machine learning model might make the scientific model part effectively ignored in prediction.

arXiv CS 8d ago

Inconsistency-Aware Minimization: Improving Generalization with Unlabeled Data

Announce Type: new Abstract: Estimating the generalization gap and developing optimization methods that improve generalization are crucial for deep learning models, for both theoretical understanding and practical applications. Leveraging unlabeled data for these purposes offers significant advantages in real-world scenarios. This paper introduces a novel generalization measure, local inconsistency, derived from an information-geometric perspective on the parameter space of neural networks.

arXiv CS 9d ago