Home Knowledge Base the Human-Centered Benchmarking Framework

the Human-Centered Benchmarking Framework

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Human-Centered Benchmarking of Driver Monitoring Models

arXiv:2606.08123v1 Announce Type: new Abstract: Vision-based driver monitoring systems are increasingly deployed in safety-critical intelligent transportation settings, yet they are almost always compared on classification accuracy alone. This paper argues that accuracy is insufficient to characterize a model's fitness for real-world deployment, and proposes the Human-Centered Benchmarking Framework (HCBF), which evaluates models across four dimensions: accuracy, explainability, efficiency,...

arXiv CS 1d ago

Toward Human-Centered Multi-Agent Systems: Integrating Cognition, Culture, Values, and Cooperation in AI Agents

Announce Type: new Abstract: The emergence of large language model (LLM)-based agents and multi-agent systems has enabled a shift from narrow task automation to more autonomous decision-making. Despite progress in language generation, planning, tool use, and coordination, most agents still treat intelligence as prediction, optimization, and task completion. Human environments are social and normative, where people reason under bounded rationality, communicate in culturally situated language,...

arXiv CS 1d ago

MindClaw: Closed-Loop Embodied Mental-State Reasoning for Precision Intervention

arXiv:2606.01063v1 Announce Type: new Abstract: Theory of Mind (ToM) enables an agent to reason about another actor's beliefs, goals, and intentions, which is essential for human-centered embodied assistance. Existing ToM benchmarks have advanced text and multimodal mental-state recognition, but they mostly evaluate offline question answering or final action prediction. They do not fully test whether an embodied agent can stay connected to a changing environment, update actor-specific...

arXiv CS 8d ago

SHALA-LLM: Smartly Handling Ambiguous Labels in Aligning LLMs

arXiv:2606.05376v1 Announce Type: new Abstract: Many human-centered tasks, including natural language inference (NLI) and emotion recognition (ER), have multiple plausible interpretations, leading to label ambiguity and challenging disagreements across human annotators. As LLMs are increasingly deployed in real-world settings, faithfully modeling such ambiguity is essential to identify contested inputs, preserve variability in ambiguous cases, and capture the full distribution of human...

arXiv CS 5d ago