
Frontier Developments

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Anthropic says AI labs need coordinated plan to halt development if risks rise

June 4: Anthropic said on Thursday frontier AI developers should establish a coordinated, verifiable way to slow down or temporarily pause development if advanced systems begin improving themselves faster than society can manage the risks. AI that can build itself would be a major development in the history of technology, but "full recursive self-improvement also might increase the risks of humans losing control...

Channel News Asia 5d ago

If Claude Fable stops helping you, you'll never know

I didn't expect to read this in a model card. Fable 5 model card: we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.

Hacker News 14h ago

xAI is looking more like a datacentre REIT than a frontier lab

An unexpected development over the past few weeks is xAI's new partnerships with Anthropic and Google, providing them with a huge amount of capacity. It's worth remembering that xAI is now part of SpaceX, after the two merged back in February - so the revenue from these deals flows straight into the entity about to go public. While much has been made of the potential financial engineering given SpaceX's upcoming IPO, I think...

Hacker News 1d ago

'It would be good for the world' to slow down AI sprints, Anthropic says

It would be “good for the world” to slow down the pace of AI development, according to a blog post from Anthropic, which this week began the process of going public with a confidential IPO filing. “We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology,” stated a blog post written by Anthropic co-founder (and former Reg scribe) Jack...

The Register 5d ago

MBABench: Evaluating LLM Agents on End-to-End Spreadsheet Tasks in Finance

arXiv:2605.22664v2. Abstract: LLM agents are increasingly expected to carry out end-to-end workflows, producing complete artifacts from high-level user instructions. To meet enterprise needs, frontier AI labs have developed agents that can construct entire spreadsheets from scratch. This is especially relevant in finance, where core workflows such as financial modeling, forecasting, and scenario analysis are commonly conducted through spreadsheets.

arXiv CS 1d ago

The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development?

Abstract: Current AI benchmarks evaluate agents on task execution within human-designed workflows. These evaluations fundamentally fail to measure a critical next-level capability: whether models can autonomously develop agent systems. We introduce the Meta-Agent Challenge (MAC), an evaluation framework designed to test the capacity of frontier models for autonomous agent development.

arXiv CS 6d ago

Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation

arXiv:2602.07298v3. Abstract: Large Language Models (LLMs) represent a promising frontier for recommender systems, yet their development has been impeded by the absence of predictable scaling laws, which are crucial for guiding research and optimizing resource allocation. We hypothesize that this may be attributed to the inherent noise, bias, and incompleteness of raw user interaction data in prior continual pre-training (CPT) efforts. This paper introduces a novel,...

arXiv CS 8d ago

When AI Builds Itself: Our progress toward recursive self-improvement

For most of AI’s history, humans drove every step in its development cycle. But at Anthropic, we are delegating a growing share of AI development to AI systems themselves, which is speeding up our work. Taken far enough, and given enough compute, that trend points to an AI system capable of fully autonomously designing and developing its own successor.

Hacker News 5d ago

Xi’s last frontier: China’s plan to transform its west

A vast development drive aims to tap the region’s economic potential and extend Beijing’s control

Financial Times 9d ago

Cookie-Bench: Continuous On-screen Key Interaction Evaluation for Web Generation

arXiv:2605.30000v2. Abstract: Front-end web code has become a core product surface for every frontier LLM release, yet evaluating these interactive applications at development speed remains costly because human-judged leaderboards like Arena do not scale. Existing automated proxies typically lean on reference implementations, test suites, or rigid checklists, and tend to miss the reasoned synthesis a human reviewer performs over a live session. We articulate a new...

arXiv CS 8d ago