Benchmarking Codebase Conversion

Related Articles from SNS

Converted, Not Equivalent: Benchmarking Codebase Conversion via Observational Equivalence

Coding agents increasingly act as codebase-scale collaborators that can assist with codebase conversion, but this progress has exposed a critical weakness: agents often over-trust their own local validation routines and declare success on artifacts that satisfy surface checks while violating the semantic contracts users actually care about. This problem is especially acute in codebase conversion, where prior evaluation is largely outcome-driven and therefore...

arXiv CS 5d ago

When AI Builds Itself: Our progress toward recursive self-improvement

For most of AI’s history, humans drove every step in its development cycle. But at Anthropic, we are delegating a growing share of AI development to AI systems themselves, which is speeding up our work. Taken far enough, and given enough compute, that trend points to an AI system capable of fully autonomously designing and developing its own successor.

Hacker News 6d ago

Anthropic/OpenAI may be spending more than $1000 for every $100 you pay them

For reasons that will remain hidden, we resume writing about Generative AI/LLM after a hiatus of 15 months (that one from October 2025, and the one from June 2025, don’t really count as serious pieces). Today, the first of two articles about “coding with Large ‘Language’ Models”, as coding with LLMs is positioned as the ‘killer app’ for LLMs. We interrupt this program for a short digression on Anthropic’s recently released blog post When AI builds itself.

Hacker News 3d ago

Bun Has Been Converted to Rust. Now What?

On May 14, PR #30412 merged into Bun's main branch: a little over a million lines of Rust, 6,755 commits, generated almost entirely by Claude Code agents over nine days. Anthropic, which acquired Bun in December, supplied the agents. The Zig implementation that powered Bun is gone.

Hacker News 7d ago