ForeSci

Related Articles from SNS

ForeSci: Evaluating LLM Agents for Forward-Looking AI Research Judgment

Abstract: AI research often requires decisions before future evidence exists: which bottleneck to attack, which direction to pursue, or where a project should be positioned. We introduce ForeSci, a temporally controlled benchmark for evaluating whether LLM agents can make such forward-looking research judgments from historical evidence. ForeSci contains 500 tasks across four fast-moving AI domains and four decision families.

arXiv CS 5d ago