Multiple Annotators
Related Articles from SNS
Annot-Mix: Learning with Noisy Class Labels from Multiple Annotators via a Mixup Extension
arXiv:2405.03386v2. Training with noisy class labels impairs neural networks' generalization performance. In this context, mixup is a popular regularization technique to improve training robustness by making memorizing false class labels more difficult. However, mixup neglects that multiple annotators, e.g., crowdworkers, typically provide class labels.
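For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard mixup on a single pair of examples. It is not the Annot-Mix extension, whose details are not given in this snippet. The function name `mixup_pair`, the `alpha` default, and the list-based feature and one-hot label representation are illustrative assumptions.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Standard mixup on one pair of examples (hypothetical helper).

    x_i, x_j: feature vectors as lists of floats.
    y_i, y_j: one-hot (or soft) label vectors as lists of floats.
    alpha:    Beta(alpha, alpha) concentration; an assumed default.
    """
    # Draw the mixing coefficient from a symmetric Beta distribution.
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs and of the labels.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
    print(lam, x, y)
```

Because the mixed label is a blend of two labels, a network cannot fit it by memorizing a single (possibly wrong) class, which is the robustness effect the abstract mentions. Note that this sketch assumes one label per example; it has no notion of which annotator supplied it.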
CellClick: an interactive platform for adjustable and accurate cell type annotation in single-cell and spatial omics data
Single-cell omics and spatial omics technologies are nowadays widely used in biological and medical research. In both single-cell and spatial omics data analysis, accurate cell type annotation is a key step for downstream analysis and scientific discoveries. However, high-quality cell annotation usually requires multiple rounds of manual analysis for result refinement, which poses great challenges to most researchers.
From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation
arXiv:2606.06266v1. Hate speech detection is inherently subjective: people from different demographic groups perceive the same content very differently. Collecting enough annotations from multiple demographic groups is costly and difficult to scale. Persona-conditioned Large Language Models (models prompted to adopt a specific demographic identity) have been proposed as a way to simulate diverse perspectives at scale.
Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling
Preference modeling plays a central role in reinforcement learning from human feedback (RLHF), enabling large language models (LLMs) to align with human values. However, most existing approaches assume a universal reward function, neglecting the diversity and heterogeneity of human preferences. To address this limitation without additional annotation costs, recent work has proposed learning multiple preference components from binary data and combining them to...
STABLEVAL: Disagreement-Aware and Stable Evaluation of AI Systems
arXiv:2605.02122v2. Human evaluation remains the primary standard for assessing modern AI systems, yet annotator disagreement, bias, and variability make system rankings fragile under standard majority vote aggregation. Majority vote discards annotator reliability and item-level ambiguity, often yielding unstable comparisons across annotator subsets. We introduce STABLEVAL, a disagreement-aware evaluation framework that models latent item correctness and...
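The limitation of majority vote described in this abstract can be illustrated with a small sketch. This is not the STABLEVAL method, which the truncated snippet does not specify; the helper names `majority_vote` and `soft_label` and the toy votes are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label (hypothetical helper; ties are not handled specially)."""
    return Counter(labels).most_common(1)[0][0]


def soft_label(labels):
    """Return the empirical label distribution, which keeps the disagreement."""
    counts = Counter(labels)
    n = len(labels)
    return {label: count / n for label, count in counts.items()}


if __name__ == "__main__":
    unanimous = ["A", "A", "A"]
    contested = ["A", "A", "B"]
    # Both items collapse to the same label, so the ambiguity is lost.
    print(majority_vote(unanimous), majority_vote(contested))
    print(soft_label(unanimous), soft_label(contested))

    # The aggregate can flip depending on which annotators are included.
    votes = ["A", "A", "B", "B", "B"]
    print(majority_vote(votes[:3]), majority_vote(votes))
```

In this toy data the unanimous and contested items receive the same aggregated label, and the five-annotator item changes its label when only the first three annotators are counted, which is the kind of instability across annotator subsets the abstract points to.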
Physical Plausibility Reasoning via HCM-GRPO: Empowering Compact Model for Superior Performance
arXiv:2511.10055v2. The performance of image generation has been significantly improved in recent years. However, the study of image screening is rare, and its performance with Multimodal Large Language Models (MLLMs) is unsatisfactory due to the lack of data and the weak physical plausibility reasoning ability in MLLMs. In this work, we propose a complete solution to address these problems in terms of data and methodology.
SHALA-LLM: Smartly Handling Ambiguous Labels in Aligning LLMs
arXiv:2606.05376v1. Many human-centered tasks, including natural language inference (NLI) and emotion recognition (ER), have multiple plausible interpretations, leading to label ambiguity and challenging disagreements across human annotators. As LLMs are increasingly deployed in real-world settings, faithfully modeling such ambiguity is essential to identify contested inputs, preserve variability in ambiguous cases, and capture the full distribution of human...
CREP: Cis-Regulatory Element Predictor Based on Fine-Tuned Enformer
A substantial fraction of disease-associated genetic variants reside in non-coding regions of the genome, where they act by perturbing cis-regulatory elements (CREs) such as enhancers, promoters, and insulators. While recent sequence-based deep learning models, such as Enformer, accurately predict continuous epigenomic signals from DNA sequence, they do not directly provide discrete and interpretable CRE annotations. Here, we present CREP (Cis-Regulatory Element Predictor), a fine-tuned...
Decoding Hierarchical Cell-Cell Communication in Spatial Multi-Omics with CellSTIC
Cell-cell communication helps to coordinate tissue development, homeostasis, and immune responses, but identifying signaling interactions within intact tissues remains difficult. Although single-cell transcriptomics has enabled systematic inference of ligand-receptor interactions, dissociation disrupts spatial context and limits the identification of bona fide local signaling and region-specific communication programs. Spatial transcriptomics and spatial multi-omics offer the opportunity to...
Learning Perspectivist Social Meaning via Demographic-Conditioned Fusion Embeddings
arXiv:2606.07123v1. Social meaning in language is inherently perspectival, varying across annotator backgrounds, demographics, and ideological positions. However, most NLP systems collapse this variation into a single ground-truth label, ignoring the diversity of interpretations. In this work, we model social dimensions along a perspectivist spectrum, capturing how interpretations vary across demographic groups on a dataset consisting of 28k human annotations.