ARC-Easy

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Toxic HallucinAItions: Perturbing Prompts and Tracing LLM Circuits

arXiv:2605.30913v1. Abstract: Large language models (LLMs) are increasingly deployed in conversational settings where user tone ranges from polite to adversarial or toxic, yet less is known about whether toxic language in otherwise semantically equivalent prompts can degrade factual reliability. We study how lexical and tone-based prompt perturbations affect the factual reliability of LLMs. Using controlled prompt variations across polite, random, and three toxicity levels,...

arXiv CS 9d ago