Capturing LLM Capabilities via Evidence-Calibrated Query Clustering

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Fangzhou Wu, Sandeep Silwal, Qiuyi Zhang

Key Points

arXiv:2605.17110v2

Abstract: Query clustering organizes queries into groups that reflect shared latent capability demands, enabling capability-aware LLM evaluation. Existing clustering methods, which primarily rely on semantic taxonomies or embeddings, often fail to capture such latent capability requirements due to a misalignment between surface-level semantics and actual model performance. We propose ECC, an algorithm that calibrates prior semantic embeddings using limited posterior model comparisons to bridge the gap between surface-level semantics and latent capability requirements. ECC characterizes each cluster through a capability profile parameterized by a Bradley-Terry model and uses trainable mixture weights to accommodate queries with mixed capability demands, jointly learning a flexible, capability-aware clustering structure that supports query-specific inference of LLM capabilities. Extensive quantitative and qualitative evaluations demonstrate that ECC significantly improves LLM capability ranking quality, outperforming human-labeled and embedding-based baselines by an average of 17.64 and 18.02 percentage points, respectively, and proves effective in downstream tasks such as query routing.
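To make the idea of per-cluster Bradley-Terry capability profiles combined through mixture weights more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how mixture weights and cluster profiles are combined, so the sketch assumes a query's win probability is a softmax-weighted average of per-cluster Bradley-Terry probabilities. The function names, the `CLUSTER_SCORES` values, and the ranking rule (average win probability against the other models) are all hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Turn unconstrained mixture logits into weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bt_win_prob(score_i, score_j):
    """Bradley-Terry: P(i beats j) = sigmoid(score_i - score_j)."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))

def mixture_win_prob(mix_logits, cluster_scores, i, j):
    """Assumed combination rule: weight each cluster's Bradley-Terry
    probability by the query's mixture weight for that cluster."""
    weights = softmax(mix_logits)
    return sum(
        w * bt_win_prob(scores[i], scores[j])
        for w, scores in zip(weights, cluster_scores)
    )

def rank_models(mix_logits, cluster_scores):
    """Rank models for one query by average win probability (illustrative rule)."""
    n = len(cluster_scores[0])
    avg_win = [
        sum(mixture_win_prob(mix_logits, cluster_scores, i, j)
            for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    return sorted(range(n), key=lambda i: avg_win[i], reverse=True)

# Hypothetical capability profiles: one score per model, per cluster.
CLUSTER_SCORES = [
    [1.5, 0.0, -0.5],   # cluster 0: model 0 is strongest
    [-1.0, 0.5, 1.0],   # cluster 1: model 2 is strongest
]

if __name__ == "__main__":
    # A query that belongs almost entirely to cluster 0.
    print(rank_models([5.0, -5.0], CLUSTER_SCORES))
    # A query with evenly mixed capability demands.
    print(mixture_win_prob([0.0, 0.0], CLUSTER_SCORES, 0, 2))
```

Under these assumptions, a query whose weight sits almost entirely on one cluster inherits that cluster's ranking, while a query with mixed demands gets a blended comparison. In a trained system the scores and mixture logits would be learned from observed model comparisons rather than set by hand.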


Originally published by arXiv CS.
