SliceLine
Related Articles from SNS
Decoy-Calibrated Failure Audits for Language Models
arXiv:2606.09046v1. Abstract: Useful audits reveal not only how often a model fails, but also where its failures concentrate. An auditor may test many candidate explanations: long inputs, indirect questions, distracting evidence, or combinations of these factors. The risk is selection.