
the Most of Limited Data


Related Articles from SNS

Making the Most of Limited Data: Score-Aware Training for Text-to-Music Generation

Abstract: State-of-the-art text-to-music generation systems rely on massive proprietary datasets and industrial-scale compute, making it impossible to disentangle architectural contributions from resource advantages. We propose score-aware training, which treats audio-caption alignment score as a direct supervision signal throughout the pipeline. Rather than discarding low-scoring segments, we repurpose them via a CLAP-conditioned Beta noise timestep schedule that...

arXiv CS 2d ago

Synthetic Benchmarks Overstate Forward-Forward Scaling: Real-Data Limits of Layer-Local Training

Abstract: Forward-Forward (FF) learning [Hinton, 2022] replaces backpropagation (BP) with strictly layer-local goodness updates. Recent FF-CNN work has narrowed the gap to BP on 32x32 benchmarks, raising the question of whether layer-local training is becoming a viable alternative at realistic scale. To probe this rigorously, we develop DTG-FF (dynamic temperature goodness, decoupled normalization, and multi-layer fusion) as an instrument that sets FF-family state of the art across nine...

arXiv CS 2d ago

Amazon Employees Show Up to City Council Meeting to Demand Limits on Data Centers

Two Amazon employees on Wednesday publicly called for regulations on new data centers, telling elected officials in Seattle that unchecked development of the sharply disputed nerve centers of AI threatens the region’s environment, economy, and safety. “Local governments, in collaboration with community stakeholders, should be setting the terms for data center buildout,” Amazon senior software engineer Liesl Wigand said at a city hearing. “Let’s not let big tech burn Seattle to win the AI race.”

Wired 7d ago

Trust-Aware Predictive Emissions Monitoring for Gas Turbine Fleets with Limited Labelled Data

arXiv:2606.06156v1 Abstract: Machine learning-based predictive emissions monitoring systems offer a practical alternative to direct emissions measurement, but their deployment across gas turbine fleets is challenging when emissions labels are available for only a small subset of assets. In this work, a trust-aware probabilistic framework is proposed for fleet-level gas turbine NOx prediction under limited labelled supervision. The framework combines a multi-head recurrent...

arXiv CS 5d ago

Which Anatomy Matters Under Limited Labels? A Data-Efficient Anatomy-Aware Benchmark for Cardiac Pathology Prediction

Abstract: Numerous medical imaging problems must be solved under limited labels and constrained compute, yet it remains unclear whether performance gains are driven mainly by more expressive models or by better representation of clinically meaningful anatomy. We study this question through a low-data anatomy-aware benchmark for 5-class cardiac pathology prediction on the public ACDC MRI dataset.

arXiv CS 2d ago

When More Data Doesn't Help: Limits of Adaptation in Multitask Learning

Abstract: Multitask learning and related frameworks have achieved tremendous success in modern applications. In a multitask learning problem, we are given a set of heterogeneous datasets collected from related source tasks and hope to enhance the performance above what we could hope to achieve by solving each of them individually. The recent work of arXiv:2006.15785 has shown that, without access to distributional information, no algorithm based on aggregating samples...

arXiv CS 9d ago

Predicting Dynamic Map States from Limited Field-of-View Sensor Data

arXiv:2602.12360v2 Abstract: When autonomous systems are deployed in real-world scenarios, sensors are often subject to limited field-of-view (FOV) constraints, either naturally through system design, or through unexpected occlusions or sensor failures. In conditions where a large FOV is unavailable, it is important to be able to infer information about the environment and predict the state of nearby surroundings based on available data to maintain safe and accurate...

arXiv CS 2d ago

When Data Is Scarce: Scaling Sparse Language Models with Repeated Training

Abstract: Scaling laws for dense LLMs under infinite data are well explored, but how sparsity interacts with limited data is not. In this work, we study sparse training in data-constrained regimes where limited unique tokens require multi-epoch training. Our experiments span models up to 1.92B parameters in the fitting set, sparsity up to 93.75%, unique data budgets up to 2.6B tokens, and total training tokens up to 41.6B over 16 epochs; we further validate extrapolation...

arXiv CS 8d ago

Microsoft limits employee use of Anthropic's Claude Fable 5 over data retention concerns, The Verge reports

June 10: Microsoft is limiting employees' use of Anthropic's Claude Fable 5 because of the AI startup's new data retention requirements, The Verge reported on Wednesday, citing sources. Anthropic on Tuesday said it is rolling out Claude Fable 5, a public version of its Mythos AI model, with guardrails barring its use in risky areas such as cybersecurity.

Channel News Asia 6h ago

DAD4TS: Data-Augmentation-Oriented Diffusion Model for Time-Series Forecasting with Small-Scale Data

arXiv:2605.17866v2 Abstract: Small-scale data is a critical problem in time-series forecasting tasks. Data augmentation is an effective strategy for this task, but it has a limitation in generating meaningful data. To address this limitation, we propose DAD4TS, a diffusion-model-based data augmentation method with reinforcement learning, designed for time-series forecasting with small-scale data.

arXiv CS 7d ago