Chinchilla
Related Articles from SNS
Gas industry boom brings uneven growth to Western Downs towns
Sat 6 Jun 2026 at 8:59am. In Chinchilla on Queensland's Western Downs, Lyn McCullough throws open the doors of her bakery at 4am. It is still dark and, at this time of year, cold. But within minutes, a line of high-vis-clad workers files in, stocking up before heading out to nearby gas fields.
Human-Like Neural Nets by Catapulting
Speculative proposal to create artificial neural nets with human-like performance by high-learning-rate/regularization training of overparameterized NNs to trigger catapulting/grokking. Over-parameterization as a route to true generalization would resolve many outstanding mysteries of artificial versus natural intelligence. There are many mysteries about deep learning and human intelligence, but we could describe the biggest anomaly this way: why are...
Scaling Laws for Behavioral Foundation Models over User Event Sequences
arXiv:2606.05257v1. Abstract: Foundation models are increasingly trained on sequences of user actions in recommendation, payments, fraud, and commerce, but these models still lack the kind of compute calibration that scaling laws provide for language models. We study a common two-part behavioral-model architecture: a feature-based event embedder maps each multi-modal item to a vector, and a decoder-only transformer predicts the next event from the resulting sequence. Across...
Data-Constrained Language Model Pretraining: Improved Regularization and Scaling Laws
Abstract: Classical scaling laws for language model pretraining balance model size against training dataset size under a fixed compute budget, assuming abundant data and a single pass over the corpus. As training compute grows faster than the supply of natural language data, pretraining is likely to enter a data-constrained, compute-rich regime where models train for multiple epochs over a finite dataset. We study data-constrained pretraining along two axes, regularization...
Explaining Data Mixing Scaling Laws
arXiv:2606.08167v1. Abstract: Recent research has established empirical scaling laws to predict model performance on multi-domain data mixtures. However, a theoretical understanding of these model loss behaviors remains absent. In this work, we propose a unified framework to explain the underlying mechanics of data mixing.