Autonomous AI screening flags unreliable Lyme test results, boosting sensitivity to 95.7%
Andrew Zinin
Lead Editor
Computational point-of-care sensors can significantly improve access to diagnostics by enabling rapid patient testing outside centralized medical facilities. These tests rely on machine learning models to make diagnostic predictions, but such inference models are susceptible to hallucinations and may produce erroneous outcomes. As a result, their limited reliability has partially hindered the broader adoption of computational sensors in health care settings.
To address this, researchers at UCLA have developed an uncertainty quantification framework that autonomously identifies and excludes unreliable neural network predictions in computational point-of-care diagnostic platforms. As a testbed, they used a paper-based vertical flow assay for rapid diagnosis of Lyme disease—the most common tick-borne disease worldwide.
How the assay works
The vertical flow assay platform combines a disposable paper-based assay, a handheld smartphone-based optical reader and a deep learning inference algorithm to deliver Lyme disease diagnostic results in under 20 minutes using only a droplet of patient serum. Unlike conventional single-analyte rapid tests that display a single test line, the vertical flow assay contains a multiplexed panel of 25 spots, each coated with proteins that specifically bind to antibodies developed in response to Lyme exposure.
After activation, the assay produces a rich image-based signal pattern that is interpreted by a neural network, which delivers a binary Lyme-positive or Lyme-negative result.
Measuring prediction uncertainty
To improve the reliability of this vertical flow assay platform, UCLA researchers developed an uncertainty quantification framework based on Monte Carlo dropout (MCDO). In this framework, each sample is processed not only by the baseline diagnostic neural network, but also by 1,000 additional models, termed MCDO models. These models share the same architecture and trained weights as the baseline model, but during inference, a random subset of their neurons is temporarily deactivated, producing a distribution of 1,000 additional predictions.
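The idea of running the same trained network many times with random neurons switched off can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the study: the tiny one-hidden-layer network, its made-up weights, the 4-value input (the real assay has 25 spots), the dropout rate of 0.2 and the helper names `forward` and `mcdo_predictions`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, w_hidden, w_out, drop_rate=0.0, rng=None):
    """One pass through a tiny one-hidden-layer network.

    drop_rate == 0 gives the baseline prediction. With drop_rate > 0, each
    hidden neuron is zeroed with that probability for this pass only; the
    weights themselves never change.
    """
    hidden = []
    for weights in w_hidden:
        # Weighted sum followed by ReLU
        activation = max(0.0, sum(w * x for w, x in zip(weights, features)))
        if drop_rate > 0.0:
            if rng.random() < drop_rate:
                activation = 0.0  # neuron temporarily deactivated
            else:
                activation /= 1.0 - drop_rate  # standard inverted-dropout rescaling
        hidden.append(activation)
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


def mcdo_predictions(features, w_hidden, w_out, n_models=1000, drop_rate=0.2, seed=0):
    """Collect n_models stochastic predictions, one per random dropout mask."""
    rng = random.Random(seed)
    return [forward(features, w_hidden, w_out, drop_rate, rng) for _ in range(n_models)]


# Hypothetical weights and input, chosen only so the example runs
W_HIDDEN = [[0.8, -0.5, 0.3, 0.1], [-0.2, 0.9, 0.4, -0.7], [0.5, 0.5, -0.6, 0.2]]
W_OUT = [1.2, -0.8, 0.6]
sample = [0.9, 0.1, 0.4, 0.7]

baseline = forward(sample, W_HIDDEN, W_OUT)
mcdo = mcdo_predictions(sample, W_HIDDEN, W_OUT)
print(f"baseline={baseline:.3f}  MCDO mean={sum(mcdo) / len(mcdo):.3f}  n={len(mcdo)}")
```

Because the dropout masks are drawn at inference time, the 1,000 predictions come from a single stored set of weights, which is why no additional trained models need to be kept in memory.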
By comparing this distribution with the baseline disease prediction, the researchers introduced a score termed the uncertainty figure of merit, which autonomously quantifies the reliability of each diagnostic result.
Tests with low figure-of-merit values are deemed unreliable and automatically flagged as "Do not use." Such tests are excluded from clinical decision-making. Tests with high figure-of-merit values are labeled "Trust," and their diagnostic predictions can be reported to clinicians for follow-up care and treatment.
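The article does not give the formula for the uncertainty figure of merit, so the sketch below substitutes a simple stand-in: the fraction of MCDO predictions that agree with the baseline's binary call. The function names, the 0.5 decision cutoff and the 0.9 trust threshold are all hypothetical; only the "Trust" and "Do not use" labels come from the text above.

```python
def figure_of_merit(baseline_prob, mcdo_probs, cutoff=0.5):
    """Stand-in score (an assumption, not the study's definition):
    share of MCDO predictions on the same side of the cutoff as the baseline."""
    baseline_call = baseline_prob >= cutoff
    agree = sum((p >= cutoff) == baseline_call for p in mcdo_probs)
    return agree / len(mcdo_probs)


def triage(baseline_prob, mcdo_probs, fom_threshold=0.9, cutoff=0.5):
    """Return (reliability label, diagnostic call, score) for one test."""
    fom = figure_of_merit(baseline_prob, mcdo_probs, cutoff)
    label = "Trust" if fom >= fom_threshold else "Do not use"
    call = "Lyme-positive" if baseline_prob >= cutoff else "Lyme-negative"
    return label, call, fom


# Made-up probabilities: one confident test, one borderline test
confident = triage(0.93, [0.91, 0.88, 0.95, 0.90, 0.97])
borderline = triage(0.46, [0.41, 0.58, 0.52, 0.44, 0.61])
print(confident)
print(borderline)
```

In this toy case the borderline test is flagged because most of the perturbed models disagree with the baseline's Lyme-negative call. The flag is computed from the model's own outputs alone, without access to the patient's true diagnosis.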
Improving Lyme test sensitivity
With this MCDO-based approach, the researchers specifically aimed to reduce false-negative predictions—cases in which a patient with Lyme disease is incorrectly diagnosed as Lyme-negative. A missed Lyme infection due to poor sensitivity of the test can lead to serious physiological and neurological complications that may persist for months or even years.
Diagnostic sensitivity is the fraction of truly positive cases that a test correctly detects: The higher the sensitivity, the fewer infections are missed. In blinded testing on new patient samples, applying the uncertainty quantification pipeline improved the sensitivity of the vertical flow assay from 88.2% to 95.7%, while preserving 100% specificity.
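The arithmetic behind such an improvement can be shown with a small worked example. Sensitivity is true positives divided by the sum of true positives and false negatives, and specificity is true negatives divided by the sum of true negatives and false positives. The 16 records below are invented purely for illustration and are not the study's data; the point is only that excluding flagged tests can remove false negatives from the tally.

```python
def sensitivity(records):
    """TP / (TP + FN) over (truth, predicted, label) records."""
    tp = sum(1 for truth, pred, _ in records if truth and pred)
    fn = sum(1 for truth, pred, _ in records if truth and not pred)
    return tp / (tp + fn)


def specificity(records):
    """TN / (TN + FP) over (truth, predicted, label) records."""
    tn = sum(1 for truth, pred, _ in records if not truth and not pred)
    fp = sum(1 for truth, pred, _ in records if not truth and pred)
    return tn / (tn + fp)


# Invented toy cohort: (has Lyme, predicted positive, reliability label)
all_tests = (
    [(True, True, "Trust")] * 7          # detected infections, trusted
    + [(True, True, "Do not use")]       # detected infection, but flagged
    + [(True, False, "Do not use")]      # missed infection, caught by the flag
    + [(True, False, "Trust")]           # missed infection that slips through
    + [(False, False, "Trust")] * 5      # correct negatives, trusted
    + [(False, False, "Do not use")]     # correct negative, but flagged
)
trusted = [r for r in all_tests if r[2] == "Trust"]

print(f"all tests:    sensitivity={sensitivity(all_tests):.3f}  specificity={specificity(all_tests):.3f}")
print(f"trusted only: sensitivity={sensitivity(trusted):.3f}  specificity={specificity(trusted):.3f}")
```

Here sensitivity rises from 8/10 to 7/8 once flagged tests are set aside. The trade-off is that some correct results are also excluded, so those patients would need retesting.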
"Our framework identifies unreliable test results in a fully autonomous way, without the need to know the patient's true diagnosis," said Prof. Aydogan Ozcan, who supervised the research.
"Monte Carlo dropout is especially well-suited for point-of-care settings because it delivers meaningful reliability estimates at very low computational cost, with no extra memory or hardware required," Ozcan added.
Broader potential for diagnostics
The team validated the framework using samples from two independent biobanks, the Lyme Disease Biobank and the U.S. Centers for Disease Control and Prevention, demonstrating that it can generalize across different collection sites and time periods.
Although demonstrated here for Lyme disease, the approach is designed to work with any rapid diagnostic test processed by a neural network, including tests for other infectious diseases, cardiovascular conditions and various clinical biomarker panels.
"Our MCDO-based uncertainty quantification pipeline can be integrated with any computational point-of-care sensor that relies on a neural network-based inference model," said Dr. Artem Goncharov, first author of the study.
Publication details
Artem Goncharov et al, Autonomous Uncertainty Quantification for Computational Point-of-Care Sensors, ACS Nano (2026). DOI: 10.1021/acsnano.6c06616
Journal information: ACS Nano