the Latent Vulnerability Score
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
When Behavioral Safety Evaluation Fails: A Representation-Level Perspective
Announce Type: new Abstract: Large Language Model (LLM) safety has often been evaluated at the behavior level, which provides limited evidence of internal robustness, as these evaluations target outputs rather than representation-level vulnerability under intervention. We formalize this discrepancy as the audit gap: the difference between behavioral safety and robustness under intervention. To study this gap, we construct dissociated models that preserve safe outward behavior while remaining...
Latent Performance Profiling of Large Language Models
Announce Type: replace Abstract: Large language models (LLMs) frequently achieve impressive scores on standardized benchmarks, yet accuracy alone offers a limited view of their capabilities. Evaluating open-source LLMs through leaderboards faces persistent issues like data contamination, narrow task scope, and weak alignment with real-world reliability. Benchmark-based evaluations such as MMLU PRO, BBH, or IFEval primarily capture what a model outputs on fixed test sets, not how it processes...
Did Claude increase bugs in rsync?
A simple distributional analysis of every rsync release with bug data. Nothing complicated, answers only one question: are the Claude-assisted releases unusually buggy? In order to avoid accuastions of this "just being Claude defending Claude," "AI slop," "probably all hallucinations," etc., I've decided it's probably worth explaining a few key points about how this report was created: In late May 2026, rsync blew up.