Home › Knowledge Base › MLE-Bench

MLE-Bench

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

iML: Executable, Problem-Grounded, and Broadly Exploratory Code-Driven AutoML

Announce Type: replace Abstract: Automated Machine Learning (AutoML) has improved access to machine learning, yet existing techniques often remain limited in flexibility, transparency, and execution reliability. Code-driven AutoML offers a promising direction by synthesizing executable code for preprocessing, model training, and evaluation. However, current LLM-based approaches frequently generate code that is plausible in text yet brittle in execution, insufficiently grounded in the actual...

arXiv CS 8d ago

EvoMaster: A Foundational Evolving Agent Framework for Agentic Science at Scale

arXiv:2604.17406v3 Announce Type: replace Abstract: The convergence of large language models and agents is catalyzing a new era of scientific discovery: Agentic Science. While the scientific method is inherently iterative, existing agent frameworks are predominantly static, narrowly scoped, and lack the capacity to learn from trial and error. To bridge this gap, we present EvoMaster, a foundational evolving agent framework engineered specifically for Agentic Science at Scale.

arXiv CS 1d ago

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

arXiv:2606.06473v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery and machine learning engineering (MLE), where sustained self-evolution becomes a key capability. However, existing MLE agents suffer from inter-branch information isolation, memoryless search, and lack of hierarchical control, which together hinder long-horizon optimization. We present MLEvolve, an LLM-based self-evolving multi-agent...

arXiv CS 5d ago