
Privacy Policy Enforcement Guardrails for Data-Sensitive Retrieval-Augmented Generation


arXiv:2605.17034v2

Abstract: Standard PII filters often miss contextual data leakage in RAG systems, such as clusters of non-regulated attributes that collectively identify an individual. We introduce a Privacy Policy Enforcement (PPE) framework that uses dual one-class density estimators over fused text embeddings, with a calibrated abstain region for out-of-distribution inputs. Using an axis-stratified, multi-LLM synthetic data pipeline spanning medicine, finance, and law, we find that traditional Gaussian Mixture baselines fail on borderline-safe stress tests because they key on linguistic register rather than content. Our proposed T3+OCSVM detector, trained on safe and borderline-safe data, achieves a borderline AUROC of 0.93+ while reducing false positives by 44-55 percentage points and maintaining millisecond latency. Compared with supervised MLP classifiers and 14B-parameter LLM judges, our framework offers superior operational suitability: the former suffer from high abstention rates and the latter from latency and calibration issues. This methodology provides a robust stress-testing standard for any synthetic-data-trained classifier.
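The general idea of pairing two one-class density models with an abstain region can be illustrated with a small sketch. This is not the paper's T3+OCSVM detector: it swaps in a plain Gaussian kernel density score, uses toy 2-D points in place of fused text embeddings, and assumes one model is fit on safe examples and the other on sensitive examples. The names `OneClassDensity`, `decide`, `bandwidth`, and `keep_fraction`, and the rule of abstaining when both or neither model accepts an input, are all hypothetical choices made for illustration.

```python
import math


def kde_score(x, train, bandwidth):
    """Mean Gaussian-kernel similarity of x to the training points."""
    total = 0.0
    for t in train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / len(train)


class OneClassDensity:
    """Toy one-class model: accept x if its density score clears a threshold.

    Hypothetical stand-in for a one-class estimator such as an OCSVM.
    """

    def __init__(self, bandwidth=1.0, keep_fraction=0.95):
        self.bandwidth = bandwidth
        self.keep_fraction = keep_fraction

    def fit(self, train):
        self.train = list(train)
        # Calibrate the threshold on the training scores so that roughly
        # keep_fraction of the training points are accepted.
        scores = sorted(
            kde_score(x, self.train, self.bandwidth) for x in self.train
        )
        idx = int((1 - self.keep_fraction) * len(scores))
        self.threshold = scores[idx]
        return self

    def accepts(self, x):
        return kde_score(x, self.train, self.bandwidth) >= self.threshold


def decide(x, safe_model, sensitive_model):
    """Combine the two one-class models into allow / block / abstain."""
    in_safe = safe_model.accepts(x)
    in_sensitive = sensitive_model.accepts(x)
    if in_safe and not in_sensitive:
        return "allow"
    if in_sensitive and not in_safe:
        return "block"
    # Assumption: claimed by neither model (out of distribution) or by
    # both (ambiguous) falls into the abstain region.
    return "abstain"


# Toy 2-D "embeddings"; real inputs would be high-dimensional text embeddings.
safe_points = [(0, 0), (0.5, 0), (0, 0.5), (-0.5, 0), (0, -0.5)]
sensitive_points = [(5, 5), (5.5, 5), (5, 5.5), (4.5, 5), (5, 4.5)]

safe_model = OneClassDensity().fit(safe_points)
sensitive_model = OneClassDensity().fit(sensitive_points)

for query in [(0.1, 0.1), (5.1, 4.9), (2.5, 2.5)]:
    print(query, decide(query, safe_model, sensitive_model))
```

In this toy setup, a query near the safe cluster is allowed, one near the sensitive cluster is blocked, and a point far from both falls into the abstain region instead of being forced into either class.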
Originally published by arXiv CS.