Wilson CI

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

I built a vulnerable app and spent $1,500 seeing if LLMs could hack it

As a part of my work I do security research for various apps and websites. I wanted to see if LLMs could reproduce a common class of exploits I’ve found in multiple apps. I made a fake React Native app in Expo and a backend in Python.

Hacker News 6d ago

Selection-Aware Diagnostics for Chain-of-Thought Answer Hijacking

arXiv:2606.04717v1. We study a controlled numeric proxy for chain-of-thought (CoT) answer hijacking, motivated by attacks in which benign-looking reasoning steers a harmful final answer. CoT wrappers on GSM8K and MATH-500 flip final answers away from gold labels. Rather than treating activation patching as clean-trace restoration, we ask where hijacked trajectories are fragile and whether recovery depends on a same-problem clean source.

arXiv CS 6d ago

DPBench: Structural Determinants of Multi-Agent LLM Coordination Under Simultaneous Resource Contention

We present DPBench, a benchmark for evaluating coordination in multi-agent systems built from large language models. Existing benchmarks measure task-level success under a fixed protocol; the structural conditions under which coordination succeeds or fails at all have not been characterised. DPBench adapts the Dining Philosophers problem into a controlled testbed where the action protocol, the communication structure, and the group size each vary independently.

arXiv CS 5d ago

Replicate-anchored calibration of within-host single nucleotide variant detection in Mycobacterium tuberculosis whole genome sequencing

Intra-host genetic heterogeneity in Mycobacterium tuberculosis is biologically and clinically informative, but its detection from short read whole genome sequencing depends on thresholds over read depth (DP), alternate allele support (AD), and minor allele frequency (MAF) that are rarely empirically anchored. We developed a biological replicate-anchored, lexicographic calibration framework for per-specimen intra-host single nucleotide variant (iSNV) detection. Within-patient replicate sputum...

bioRxiv 9d ago

The Coverage Gap: Chile's Cyber Disclosure Framework versus the USA, EU and UK

We introduce the Coverage Gap as a measurable distance between the observable public exposure of critical-infrastructure operators and their declared capability to coordinate vulnerability disclosure. We instantiate it against the 915 Chilean Operadores de Importancia Vital (OIVs -- Operators of Vital Importance) designated by the National Cybersecurity Agency (ANCI) under Ley 21.663 (Resolucion Exenta No. 87, 16 December 2025). Using a passive-only, OSINT-based...

arXiv CS 5d ago