Scaling Data and Structure

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Announce Type: new Abstract: We introduce Humanoid-GPT, a GPT-style Transformer with causal attention trained on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility-generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings. Scaling both data and model capacity yields a single generative Transformer that tracks highly dynamic...

arXiv CS 7d ago

An interpretable and trustworthy AI framework for large-scale longitudinal structure-pain association studies using data from the Osteoarthritis Initiative (OAI)

Announce Type: new Abstract: Purpose: To develop an interpretable and trustworthy AI framework that combines deep learning based MRI Osteoarthritis Knee Score (MOAKS) prediction with interpretable statistical modeling to study structure-pain relationships at scale using data from the Osteoarthritis Initiative (OAI). Materials and Methods: We first developed a deep learning framework to predict MOAKS features directly from knee MRIs and incorporated conformal prediction to provide prediction...

arXiv CS 5d ago

An Empirical Study of Data Scale, Model Complexity, and Input Modalities in Visual Generalization

arXiv:2606.04409v1 Announce Type: new Abstract: Modern deep neural networks usually have large parameter scales and nonlinear hierarchical structures, and they have achieved strong performance in computer vision. However, the source of their generalization performance remains difficult to explain using traditional statistical learning theory. Among the factors that may affect visual generalization, data scale, model complexity, and input modalities are fundamental and controllable variables.

arXiv CS 6d ago

An Empirical Study of Data Scale, Model Complexity, and Input Modalities in Visual Generalization

arXiv:2606.04409v2 Announce Type: replace Abstract: Modern deep neural networks usually have large parameter scales and nonlinear hierarchical structures, and they have achieved strong performance in computer vision. However, the source of their generalization performance remains difficult to explain using traditional statistical learning theory. Among the factors that may affect visual generalization, data scale, model complexity, and input modalities are fundamental and controllable variables.

arXiv CS 1d ago

Data-Driven Spectral Prediction for Accelerating Large-Scale Electronic Structure Calculations

arXiv:2606.00401v1 Announce Type: cross Abstract: Simulating large molecular systems comprising thousands of atoms requires highly scalable methodologies. While modern Density Functional Theory (DFT) codes exhibit linear scaling, solving the associated large, sparse generalized eigenproblems remains a critical computational bottleneck on exascale architectures. In the context of the LimitX project, we propose a data-driven framework to accelerate these calculations.

arXiv CS 8d ago

Data-Driven Spectral Prediction for Accelerating Large-Scale Electronic Structure Calculations

arXiv:2606.00401v1 Announce Type: new Abstract: Simulating large molecular systems comprising thousands of atoms requires highly scalable methodologies. While modern Density Functional Theory (DFT) codes exhibit linear scaling, solving the associated large, sparse generalized eigenproblems remains a critical computational bottleneck on exascale architectures. In the context of the LimitX project, we propose a data-driven framework to accelerate these calculations.

arXiv Physics 8d ago

Rivaling Transformers: Multi-Scale Structured State-Space Mixtures for Agentic 6G O-RAN

arXiv:2510.05255v2 Announce Type: replace Abstract: In sixth-generation (6G) Open Radio Access Networks (O-RAN), proactive control is preferable. A key open challenge is delivering control-grade predictions within Near-Real-Time (Near-RT) latency and computational constraints under multi-timescale dynamics. We therefore cast RAN Intelligent Controller (RIC) analytics as an agentic perceive-predict xApp that turns noisy, multivariate RAN telemetry into short-horizon per-User Equipment (UE)...

arXiv CS 1d ago

From Scaling to Structured Expressivity: Rethinking Transformers for CTR Prediction

arXiv:2511.12081v2 Announce Type: replace Abstract: Despite massive investments in scale, deep models for click-through rate (CTR) prediction often exhibit rapidly diminishing returns -- a stark contrast to the predictable scaling laws seen in large language models (LLMs). We identify the root cause as a fundamental structural misalignment: standard Transformers assume sequential compositionality, whereas CTR data demand combinatorial reasoning over heterogeneous fields. To...

arXiv CS 8d ago

Scaling Multi-Hop Training Data via Graph-Constrained Path Selection

arXiv:2605.31238v1 Announce Type: new Abstract: Endowing large language models with compositional reasoning over specialized documents requires multi-hop training data at scale, where such data rarely exists outside of curated benchmarks built on structured sources. To construct it directly from plain, unannotated text, existing methods ask a single teacher model to jointly discover an evidence path through a document and verbalize it as a question-answer pair. However, these methods degrade...

arXiv CS 9d ago

Finding the Minimal Parameter Budget for Implicit Reasoning: A Data Complexity Driven Scaling Law for Language Models

arXiv:2504.03635v5 Announce Type: replace Abstract: Reasoning is a core capability of language models (LMs), yet it remains unclear how much model capacity is necessary to support reasoning during pretraining. In this work, we study the minimal parameter budget required for implicit reasoning, defined as the ability to infer new facts from learned knowledge without explicit chain-of-thought supervision. To isolate this phenomenon, we pretrain LMs from scratch in a controlled synthetic...

arXiv CS 2d ago