Stochastic Collapse

Related Articles from SNS

Evaluating Stochastic Collapse and Implicit Bias in Multimodal Large Language Models

arXiv:2606.05874v1 Announce Type: new Abstract: Current evaluations for Multimodal Large Language Models (MLLMs) overwhelmingly focus on utility-driven objectives, leaving model behavior under logic-neutral scenarios largely underexplored. Stochasticity is essential in scenarios where multiple actions are equally valid, such as recommending travel itineraries or daily schedules where multiple options have similar utility. In such settings, deterministic policies may lead to repetitive...

arXiv CS 5d ago

Position: the Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities

arXiv:2605.04127v2 Announce Type: replace Abstract: Model collapse, the degradation in performance that arises when generative models are trained on the outputs of prior models, is an increasing concern as artificially generated content proliferates. Related critiques of large language models have highlighted their tendency to reproduce frequent patterns in training data, their reliance on vast datasets, and their substantial environmental cost. Together, these factors contribute to data...

arXiv CS 8d ago

Martingale Neural Operators: Learning Stochastic Marginals via Doob-Meyer Factorization

arXiv:2605.15806v2 Announce Type: replace Abstract: Neural operators excel as deterministic surrogates, but inevitably collapse to the conditional mean when applied to stochastic PDEs, discarding the variance and tail structure upon which uncertainty quantification depends. Recovering this structure typically requires Monte Carlo rollouts or grafted generative models, both of which surrender the one-shot efficiency and resolution invariance that define the operator paradigm. To resolve this,...

arXiv CS 7d ago

Zero Collapse: A Failure Mode of Policy Gradient Methods in Discontinuous Reward Environments

Announce Type: new Abstract: Bidding in repeated auctions is a central challenge for reinforcement learning (RL), combining continuous control with the strategic complexities of digital advertising. While policy gradient and value-based methods seem well-suited for these settings, they often struggle with the discontinuous, "cliff-like" nature of auction reward landscapes. In a first-price auction, for example, a bidder receives zero reward until they cross a specific threshold, after which...

arXiv CS 9d ago

Uncertainty-Aware End-to-End Co-Design of Neural Network Processors: From Training and Mapping to Fabrication

arXiv:2606.04850v1 Announce Type: new Abstract: Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference workload; hardware mapping decisions determine chip area, latency, and energy; and these characteristics govern fabrication yield and manufacturing cost. In practice, these decisions are made in separate stages, and existing co-design methodologies are tightly coupled to specific algorithms, making it...

arXiv CS 6d ago

OptMuon: Closed-Loop Orthogonalized Momentum Methods for Stochastic Optimization with Zero-Noise Optimality

arXiv:2606.08783v1 Announce Type: cross Abstract: Orthogonalized momentum updates, as used in Muon-style optimizers, have recently shown strong empirical stability in large-scale deep learning. However, existing orthogonalized methods are typically paired with constant or open-loop magnitude rules, and therefore do not explicitly calibrate their update magnitudes from the observed optimization trajectory.

arXiv CS 1d ago

A Goal-Set Characterization of Task Composition in the Boolean Task Algebra

arXiv:2606.04053v1 Announce Type: new Abstract: The Boolean Task Algebra (BTA) provides a principled framework for zero-shot task composition in reinforcement learning by equipping goal-reaching tasks with Boolean operations. We revisit its structural assumptions and formalize a collapse in the space of optimal extended Q-value functions: in deterministic MDPs, every such function is fully determined by the universal and empty tasks. This makes the logarithmic set of base tasks proposed in...

arXiv CS 6d ago

Learning Multi-Modal Trajectory Policies for Data-Efficient Robotic Manipulation

arXiv:2606.01047v1 Announce Type: new Abstract: Robotic manipulation requires the effective integration of heterogeneous inputs, including visual observations, language instructions, and trajectory representations, to generate accurate actions. Existing transformer-based policies typically process these heterogeneous modalities within a shared parameter space, which often leads to modality interference and inefficient representation learning, especially in data-scarce scenarios. While...

arXiv CS 8d ago

Beyond the Frontier: Stochastic Backtracking for Efficient Test-Time Scaling

arXiv:2605.25143v2 Announce Type: replace Abstract: Test-time scaling improves language model reasoning by spending additional compute to explore multiple solution trajectories. The key challenge is to maximize accuracy while minimizing the total number of generated tokens during reasoning. Recent PRM-guided methods score intermediate prefixes to steer this search, but most are frontier-only: they keep only the current active prefixes and irreversibly prune or resample away the rest using...

arXiv CS 8d ago

The LLM warnings Google fired Timnit Gebru over have all come true

"Timnit Gebru was fired from Google in December 2020 for refusing to retract a research paper, and every single warning that paper made about large language models has now happened at a scale the industry spent 4 years trying to make people forget about. Her name is Timnit Gebru. She co-led the Ethical AI team at Google.

Hacker News 6d ago