DSR-Bench
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
Can LLMs Reason Structurally? Benchmarking via the Lens of Data Structures
Announce Type: replace Abstract: Large language models (LLMs) are deployed on increasingly complex tasks that require multi-step decision-making. Understanding their algorithmic reasoning abilities is therefore crucial. However, we lack a diagnostic benchmark for evaluating these capabilities.