Beyond Individual Personas: Aligning Synthetic Dialogue to Population-Level Behavior Distributions

arXiv CS, Tuesday 09 June 2026, 04:00 UTC. By Xinyi Liu, Rinat Khaziev, Hooshang Nayyeri, Emine Yilmaz, Charith Peris, Hari Thadakamalla

Key Points

arXiv:2606.07893v1. Abstract: Synthetic dialogue corpora are increasingly used as proxies for target dialogue data, yet persona-grounded generators optimize individual conversations rather than corpus composition, yielding locally plausible dialogues with distorted population-level behavior mixes. We introduce GroupPersona, a framework that aligns synthetic dialogue corpora to the behavior distribution of a reference corpus. GroupPersona turns population statistics into generation controls: it separates each dialogue's core behavioral signature from predictable side effects, and uses the resulting behavioral groups to condition user agents on the interaction patterns that define the reference population. We evaluate GroupPersona on four corpora crossing two dialogue sources, assistant-style and Reddit-derived, with two construction variants: structure-preserving and variation-enhanced. GroupPersona lowers Jensen-Shannon divergence between synthetic and reference distributions over 12 behavior attributes from 0.234 to 0.177 relative to the strongest average baseline, a 24.4% reduction, and is best or tied-best on all four corpora while preserving structural alignment. It also achieves the closest calibration to reference-conversation quality scores, reducing mean absolute deviation from the reference-conversation profile to 0.63 versus 0.91 for the next-best baseline.
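The headline metric can be illustrated with a minimal sketch of Jensen-Shannon divergence between a reference and a synthetic distribution over one categorical behavior attribute. The abstract does not state the logarithm base, the attribute definitions, or how the 12 per-attribute values are aggregated, so base-2 logs and a plain mean over attributes are assumptions here, and the function names (distribution, js_divergence, mean_jsd, relative_reduction) are hypothetical rather than taken from GroupPersona.

```python
import math
from collections import Counter


def distribution(labels, categories):
    """Empirical probability of each category in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[c] / total for c in categories]


def kl(p, q):
    """KL divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    # The mixture m is positive wherever p or q is, so kl() never divides by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_jsd(reference, synthetic, categories_by_attr):
    """Assumed aggregation: unweighted mean of per-attribute JSD values."""
    scores = []
    for attr, categories in categories_by_attr.items():
        p = distribution(reference[attr], categories)
        q = distribution(synthetic[attr], categories)
        scores.append(js_divergence(p, q))
    return sum(scores) / len(scores)


def relative_reduction(baseline, ours):
    """Percentage reduction of `ours` relative to `baseline`."""
    return (baseline - ours) / baseline * 100


# Toy, made-up labels for a single hypothetical attribute.
cats = {"tone": ["polite", "neutral", "hostile"]}
ref = {"tone": ["polite"] * 5 + ["neutral"] * 3 + ["hostile"] * 2}
syn = {"tone": ["polite"] * 8 + ["neutral"] * 2}
print(mean_jsd(ref, syn, cats))
print(round(relative_reduction(0.234, 0.177), 1))
```

The last line recomputes the reported reduction from the two divergence values quoted in the abstract; the toy labels exist only to exercise the functions and do not reflect the paper's data.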


Originally published by arXiv CS.
