Medical AI
Related Articles from SNS
MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety
arXiv:2606.02630v1 Announce Type: new Abstract: Patient-facing medical chatbots are commonly evaluated on single-turn prompts, yet real users push back after refusals, add urgency, and invoke authority. We introduce MultiTurnPSB, a four-turn adversarial extension of PatientSafetyBench, and evaluate GPT-4.1-mini under fixed template, template-adaptive, and live adversarial attacks. Unsafe responses rise from 35% to nearly 80% by Turn 4 under live attack.
Evaluating Factual Density in Multi-Source RAG: A Study in Medical AI Accuracy
arXiv:2605.31506v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) is the current industry standard for grounding AI in real-world facts. Traditional retrieval methods rely on keyword matching and topic proximity, ranking content based on how closely it sounds like the user's query. What they do not measure is how many verified facts the content actually contains.
Truth, Trust, and Trouble: Medical AI on the Edge
Announce Type: replace Abstract: Large Language Models (LLMs) hold significant promise for transforming digital health by enabling automated medical question answering. However, ensuring these models meet critical industry standards for factual accuracy, usefulness, and safety remains a challenge, especially for open-source solutions. We present a rigorous benchmarking framework using a dataset of over 1,000 health questions.
AutoMedBench: Towards Medical AutoResearch with Agentic AI Models
arXiv:2606.01961v1 Announce Type: new Abstract: Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research...
AutoMedBench: Towards Medical AutoResearch with Agentic AI Models
arXiv:2606.01961v2 Announce Type: replace Abstract: Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research...
ReclAIm: A Multi-Agent Framework for Monitoring and Correcting Performance Decline in Medical Imaging AI
Announce Type: replace Abstract: Purpose: To develop and evaluate a multi-agent framework (ReclAIm) for automated monitoring, detection, and correction of performance decline in medical image classification models. Materials and Methods: ReclAIm is a large language model-based multi-agent system that operates through natural language interaction. A master agent coordinating three task-specific agents performed performance evaluation and triggered fine-tuning when substantial performance...
AI: Doctors risk being sued if tools go wrong, while companies are “shielded,” report warns
Doctors are being left exposed to legal claims by a “widening gulf” between the law and the rapidly changing use of artificial intelligence (AI) in healthcare, the Medical Protection Society (MPS) has warned. In its report Closing the AI Liability Gap, the medical defence organisation said that doctors and the NHS were currently expected to absorb all legal responsibility for AI use in healthcare, while AI companies were “shielded.” Under...
Clinical Reasoning in the Age of AI: Longitudinal Cognition and Human-AI Collaboration
arXiv:2606.08442v1 Announce Type: new Abstract: As physicians turn to AI-powered systems to help meet the dual demands of speed and care quality, they are met with hallucinations and sycophancy. Understanding how doctors reason through clinical problems in real-world settings is critical for design of effective AI reasoning systems. While recent advances in medical AI have emphasized performance benchmarks and diagnostic accuracy, comparatively little attention has been paid to the structure...
Doctors and NHS could be sued for mistakes made by AI tools, report warns
Medical Protection Society calls for law to be overhauled to help medics avoid liability for errors made by technology. Doctors and the NHS could be sued for medical negligence over mistakes made by artificial intelligence tools used in diagnosing patients and suggesting their treatment, ministers are being warned. Under the law as it stands, medics and the health service can be held liable for patients being harmed or dying even if it was AI that made the errors that resulted in their suffering.