Beyond Single-Policy: Evaluating Composed Organization-Specific Policy Alignment in LLM Chatbots

arXiv CS, Thursday 04 June 2026, 04:00 UTC. By Yingjie Liu, Yongxiang Hu, Xuan Wang, Yilun Li, Yunlei Wei, Xiaoyu Wang, Yangfan Zhou


arXiv:2606.04394v1 Announce Type: new

Abstract: Large language model chatbots are increasingly deployed in organizational settings such as healthcare, finance, and public services. Evaluating policy alignment is therefore critical to reliable chatbot deployment. By analyzing real-world user queries, we find that composed-policy violations are prevalent across chatbots but overlooked by existing benchmarks. This paper presents COPAL, an automated tool for evaluating composed-policy alignment in chatbots. COPAL efficiently generates queries that trigger composed-policy failures via empirically derived interaction patterns and explicit handling contracts. The generated queries expose substantial query-handling failures: across 9 served models, composed-policy queries yield an average error rate of 33.1%, indicating that composed-policy alignment warrants further investigation.
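To make the idea of a composed-policy check concrete, here is a minimal sketch in Python. It is not COPAL's implementation: the `Policy` class, the keyword-based contracts, the toy responses, and the choice of an unweighted mean across models are all assumptions made for illustration, since the abstract does not describe these details.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Policy:
    """Hypothetical organization policy paired with a handling contract."""
    name: str
    # The contract returns True when a response handles this policy correctly.
    contract: Callable[[str], bool]


def violated_policies(response: str, policies: List[Policy]) -> List[str]:
    """Return the names of the policies whose contract the response fails."""
    return [p.name for p in policies if not p.contract(response)]


def error_rate(responses: List[str], policies: List[Policy]) -> float:
    """Fraction of responses that violate at least one composed policy."""
    if not responses:
        return 0.0
    failures = sum(1 for r in responses if violated_policies(r, policies))
    return failures / len(responses)


def mean_error_rate(per_model: Dict[str, float]) -> float:
    """Unweighted mean of per-model error rates (an assumed aggregation)."""
    return sum(per_model.values()) / len(per_model)


# Toy policies for a hypothetical healthcare chatbot (keyword checks only).
no_dosage = Policy("no_dosage_advice", lambda r: "mg" not in r.lower())
refer = Policy("refer_to_clinician", lambda r: "clinician" in r.lower())
composed = [no_dosage, refer]

# Toy responses to a query that triggers both policies at once.
responses = [
    "Please ask your clinician about this.",  # satisfies both contracts
    "Take 200 mg twice a day.",               # violates both contracts
    "Your clinician may suggest 200 mg.",     # violates no_dosage_advice only
]

if __name__ == "__main__":
    for r in responses:
        print(r, "->", violated_policies(r, composed))
    print("error rate:", error_rate(responses, composed))
```

The third response shows why composition matters: it passes the referral policy on its own yet still fails once the dosage policy applies to the same query.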


Originally published by arXiv CS.

