the Generalization Gap

Related Articles from SNS

On the Generalization Gap in Self-Evolving Language Model Reasoning

arXiv:2606.01075v2 Announce Type: new Abstract: Recent work suggests that large language models (LLMs) can improve through self-evolution (SE), using supervision signals generated by the model itself. In this work, we ask: under a strict closed-loop setup, where the self-evolution algorithm has access only to an unlabeled prompt set and a base model, how close can internally generated supervision come to oracle-supervised training? We analyze four representative strategies in a unified offline self-evolution framework:...

arXiv CS 7d ago

Inconsistency-Aware Minimization: Improving Generalization with Unlabeled Data

Announce Type: new Abstract: Estimating the generalization gap and developing optimization methods that improve generalization are crucial for deep learning models, for both theoretical understanding and practical applications. Leveraging unlabeled data for these purposes offers significant advantages in real-world scenarios. This paper introduces a novel generalization measure, local inconsistency, derived from an information-geometric perspective on the parameter space of neural networks.

arXiv CS 9d ago

Brief Announcement: Generative Markov Model for Distributed Computing Systems

Announce Type: new Abstract: Emerging distributed computing paradigms, such as the computing continuum, are inherently heterogeneous, stochastic, and complex. Efficiently and effectively utilizing all available resources across the continuum demands a unified formal model of the system. To address this gap, we propose a general framework for modeling distributed computing systems as a generative Markov model, factorized over a structured system state.

arXiv CS 7d ago

Unicorn: Scaling High-Dimensional Time Series Forecasting via Universal Correlation Modeling

Announce Type: new Abstract: Modern time series architectures face a fundamental trade-off: channel-independent models scale well with increasing data volume but ignore critical inter-channel dependencies, while channel-dependent models are expressive but remain "dimension-bounded", struggling to generalize across heterogeneous datasets. To bridge this gap, we introduce Unicorn (Universal Correlation Network), a framework for scalable, multi-dataset pretraining on high-dimensional time...

arXiv CS 9d ago

OneVLA: A Unified Framework for Embodied Tasks

arXiv:2606.01241 Announce Type: replace Abstract: Navigation and manipulation are fundamental capabilities of embodied intelligence, enabling robots to interpret natural language commands and interact physically with their surroundings. However, current Vision-Language-Action (VLA) models remain constrained by task-specific architectures, specializing in either navigation or manipulation, which hinders the development of general-purpose robotic agents. To bridge this gap, we introduce OneVLA, a unified...

arXiv CS 7d ago

Degradation-Aware Metric Prompting for Hyperspectral Image Restoration

Announce Type: replace Abstract: Unified hyperspectral image (HSI) restoration aims to recover diverse degradations within a single model. However, current methods often rely on impractical explicit priors or opaque black-box representations that overfit to training distributions, hampering generalization to unseen scenarios. To bridge this gap, we propose Degradation-Aware Metric Prompting (DAMP), a novel framework that characterizes multi-dimensional degradations through interpretable...

arXiv CS 8d ago

Fast Generalization after Interpolation via Critically Damped Momentum Optimization

arXiv:2606.01521v1 Announce Type: new Abstract: A central problem in machine learning is that models can achieve near-perfect training performance while generalizing substantially less well to unseen examples. This gap is especially acute in high-dimensional, low-sample regimes, where many interpolating solutions exist and optimization must implicitly select among minima with different generalization properties. Following recent theoretical advances on optimization dynamics near the...

arXiv CS 8d ago
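
The gap described in the abstract above (near-perfect training performance, weaker performance on unseen examples) can be illustrated with a small sketch. This is not the paper's critically damped momentum method. It assumes a 1-nearest-neighbor classifier as a stand-in interpolating model, and the synthetic data, noise level, dimensions, and all function names are hypothetical choices made only for illustration.

```python
import random


def make_data(n, dim, rng, noise=0.2):
    """Draw n points in [-1, 1]^dim (assumed toy distribution).

    The clean label is 1 when the first coordinate is positive, else 0,
    and it is flipped with probability `noise`.
    """
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        y = 1 if x[0] > 0 else 0
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(
        train,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    return nearest[1]


def error_rate(train, data):
    """Fraction of points in `data` that the 1-NN rule built on `train` misclassifies."""
    wrong = sum(1 for x, y in data if predict_1nn(train, x) != y)
    return wrong / len(data)


def generalization_gap(train, test):
    """Return (train error, test error, test error minus train error)."""
    train_err = error_rate(train, train)
    test_err = error_rate(train, test)
    return train_err, test_err, test_err - train_err


# High-dimensional, low-sample regime: 30 training points in 50 dimensions.
rng = random.Random(0)
train = make_data(30, 50, rng)
test = make_data(500, 50, rng)

train_err, test_err, gap = generalization_gap(train, test)
print(f"train error: {train_err:.3f}")
print(f"test error:  {test_err:.3f}")
print(f"gap:         {gap:.3f}")
```

Because every training point is its own nearest neighbor, the 1-NN rule fits the training set exactly, noisy labels included, so the training error is zero and the whole test error shows up as the gap. Other interpolating models would give different numbers; the sketch only makes the definition concrete.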

Learning Discriminative and Generalizable Anomaly Detector for Dynamic Graph with Limited Supervision

Announce Type: replace Abstract: Dynamic graph anomaly detection is critical for many real-world applications but remains challenging due to the scarcity of labeled anomalies. Existing methods are either unsupervised or semi-supervised: unsupervised methods avoid the need for labeled anomalies but often produce ambiguous boundaries, whereas semi-supervised methods can overfit to the limited labeled anomalies and generalize poorly to unseen anomalies. To address this gap, we consider a largely...

arXiv CS 8d ago