Home Knowledge Base K-BrowseComp

K-BrowseComp

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts

arXiv:2606.02404v1 Announce Type: new Abstract: Frontier model evaluations are shifting from foundational capabilities (e.g., instruction following and reasoning) toward compositional, agentic ones, but Korean agentic benchmarks remain scarce. We introduce K-BrowseComp, a web-browsing agent benchmark grounded in Korean contexts, consisting of 400 problems. The 300-problem K-BrowseComp-Verified subset is manually constructed and validated by native Korean speakers.

arXiv CS 8d ago