Home Knowledge Base DSR-Bench

DSR-Bench

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Can LLMs Reason Structurally? Benchmarking via the Lens of Data Structures

Announce Type: replace Abstract: Large language models (LLMs) are deployed on increasingly complex tasks that require multi-step decision-making. Understanding their algorithmic reasoning abilities is therefore crucial. However, we lack a diagnostic benchmark for evaluating these capabilities.

arXiv CS 8d ago