Bradley-Terry-Davidson

Related Articles from SNS

Distribution-Calibrated Inference Time Compute for Thinking LLM-as-a-Judge

arXiv:2512.03019v2. Abstract: Thinking Large Language Models (LLMs) used as judges for pairwise preferences remain noisy at the single-sample level, and common aggregation rules (majority vote, soft self-consistency, or instruction-based self-aggregation) are inconsistent when ties are allowed. We study inference-time compute (ITC) for evaluators that generate n independent thinking-rating samples per item, and propose a principled, distribution-calibrated aggregation...

arXiv CS 7d ago
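
To make the abstract's point about ties concrete, here is a minimal sketch of majority-vote aggregation over n judge samples for one item. It is not the paper's proposed method; the label names (`"A"`, `"B"`, `"tie"`), the function `majority_vote`, and the `drop_ties` flag are all hypothetical and chosen only for illustration. The sketch shows that the same set of samples can yield different verdicts depending on how tie votes are treated.

```python
from collections import Counter


def majority_vote(samples, drop_ties=False):
    """Aggregate pairwise-preference samples by plurality.

    samples: iterable of "A", "B", or "tie" (hypothetical label names).
    drop_ties: if True, discard "tie" votes before counting.
    Returns the winning label, or "tie" when no label is strictly ahead.
    """
    votes = [s for s in samples if not (drop_ties and s == "tie")]
    if not votes:
        return "tie"
    counts = Counter(votes).most_common()
    # A winner must be strictly ahead of the runner-up.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "tie"
    return counts[0][0]


samples = ["A", "A", "tie", "tie", "tie", "B"]
print(majority_vote(samples))                  # tie
print(majority_vote(samples, drop_ties=True))  # A
```

With six samples, counting "tie" as its own class returns a tie, while discarding tie votes returns "A". Neither convention is obviously correct, which is one simple way the choice of rule changes the outcome when ties are allowed.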