the Most of Limited Data
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
Making the Most of Limited Data: Score-Aware Training for Text-to-Music Generation
Announce Type: new Abstract: State-of-the-art text-to-music generation systems rely on massive proprietary datasets and industrial-scale compute, making it impossible to disentangle architectural contributions from resource advantages. We propose \textit{score-aware training}, which treats audio-caption alignment score as a direct supervision signal throughout the pipeline. Rather than discarding low-scoring segments, we repurpose them via a CLAP-conditioned Beta noise timestep schedule that...
Synthetic Benchmarks Overstate Forward-Forward Scaling: Real-Data Limits of Layer-Local Training
new Abstract: Forward-Forward (FF) learning [Hinton, 2022] replaces backpropagation with strictly layer-local goodness updates. Recent FF-CNN work has narrowed the gap to BP on 32x32 benchmarks, raising the question of whether layer-local training is becoming a viable alternative at realistic scale. To probe this rigorously, we develop DTG-FF -- dynamic temperature goodness, decoupled normalization, and multi-layer fusion -- as an instrument that sets FF-family state of the art across nine...
Amazon Employees Show Up to City Council Meeting to Demand Limits on Data Centers
Two Amazon employees on Wednesday publicly called for regulations on new data centers, telling elected officials in Seattle that unchecked development of the sharply disputed nerve centers of AI threatens the region’s environment, economy, and safety. “Local governments, in collaboration with community stakeholders, should be setting the terms for data center buildout,” Amazon senior software engineer Liesl Wigand said at a city hearing. “Let’s not let big tech burn Seattle to win the AI race.”
Trust-Aware Predictive Emissions Monitoring for Gas Turbine Fleets with Limited Labelled Data
arXiv:2606.06156v1 Announce Type: new Abstract: Machine learning-based predictive emissions monitoring systems offer a practical alternative to direct emissions measurement, but their deployment across gas turbine fleets is challenging when emissions labels are available for only a small subset of assets. In this work, a trust-aware probabilistic framework is proposed for fleet-level gas turbine NOx prediction under limited labelled supervision. The framework combines a multi-head recurrent...
Which Anatomy Matters Under Limited Labels? A Data-Efficient Anatomy-Aware Benchmark for Cardiac Pathology Prediction
Announce Type: cross Abstract: Numerous medical imaging problems must be solved under limited labels and constrained compute, yet it remains unclear whether performance gains are driven mainly by more expressive models or by better representation of clinically meaningful anatomy. We study this question through a low-data anatomy-aware benchmark for 5-class cardiac pathology prediction on the public ACDC MRI dataset.
When More Data Doesn't Help: Limits of Adaptation in Multitask Learning
Announce Type: replace Abstract: Multitask learning and related frameworks have achieved tremendous success in modern applications. In multitask learning problem, we are given a set of heterogeneous datasets collected from related source tasks and hope to enhance the performance above what we could hope to achieve by solving each of them individually. The recent work of arXiv:2006.15785 has showed that, without access to distributional information, no algorithm based on aggregating samples...
Predicting Dynamic Map States from Limited Field-of-View Sensor Data
arXiv:2602.12360v2 Announce Type: replace Abstract: When autonomous systems are deployed in real-world scenarios, sensors are often subject to limited field-of-view (FOV) constraints, either naturally through system design, or through unexpected occlusions or sensor failures. In conditions where a large FOV is unavailable, it is important to be able to infer information about the environment and predict the state of nearby surroundings based on available data to maintain safe and accurate...
When Data Is Scarce: Scaling Sparse Language Models with Repeated Training
Announce Type: new Abstract: Scaling laws for dense LLMs under infinite data are well explored, but how sparsity interacts with limited data is not. In this work, we study sparse training in data-constrained regimes where limited unique tokens require multi-epoch training. Our experiments span models up to 1.92B parameters in the fitting set, sparsity up to 93.75%, unique data budgets up to 2.6B tokens, and total training tokens up to 41.6B over 16 epochs; we further validate extrapolation...
Microsoft limits employee use of Anthropic's Claude Fable 5 over data retention concerns, The Verge reports
Microsoft limits employee use of Anthropic's Claude Fable 5 over data retention concerns, The Verge reports June 10 : Microsoft is limiting employees' use of Anthropic's Claude Fable 5 because of the AI startup's new data retention requirements, The Verge reported on Wednesday, citing sources. Anthropic on Tuesday said it is rolling out Claude Fable 5, a public version of its Mythos AI model, with guardrails barring its use in risky areas such as cybersecurity.
DAD4TS: Data-Augmentation-Oriented Diffusion Model for Time-Series Forecasting with Small-Scale Data
arXiv:2605.17866v2 Announce Type: replace Abstract: Small-scale data is a critical problem in time-series forecasting tasks. Data augmentation is an effective strategy for this task, but it has a limitation in generating meaningful data. To address this limitation, we propose DAD4TS, a diffusion-model-based data augmentation method with reinforcement learning, designed for time-series forecasting with small-scale data.