Nonparametric LLM Evaluation

Related Articles from SNS

Nonparametric LLM Evaluation from Preference Data

arXiv:2601.21816v2. Abstract: Evaluating the performance of large language models (LLMs) from human preference data is crucial for obtaining LLM leaderboards. However, many existing approaches either rely on restrictive parametric assumptions or lack valid uncertainty quantification when flexible machine learning methods are used.

arXiv CS 1d ago

LLMSynthor: Macro-Aligned Micro-Records Synthesis with Large Language Models

arXiv:2505.14752v3. Abstract: Macro-aligned micro-records are crucial for credible simulations in social science and urban studies. For example, epidemic models are only reliable when individual-level mobility and contacts mirror real behavior, while aggregates match real-world statistics like case counts or travel flows. However, collecting such fine-grained data at scale is impractical, leaving researchers with only macro-level data.

arXiv CS 8d ago

LLMSynthor: Macro-Aligned Micro-Records Synthesis with Large Language Models

arXiv:2505.14752v4. Abstract: Macro-aligned micro-records are crucial for credible simulations in social science and urban studies. For example, epidemic models are only reliable when individual-level mobility and contacts mirror real behavior, while aggregates match real-world statistics like case counts or travel flows. However, collecting such fine-grained data at scale is impractical, leaving researchers with only macro-level data.

arXiv CS 1d ago