Home Knowledge Base Medical AI

Medical AI

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks an dClassifier-Based Defenses for Medical AI Safety

arXiv:2606.02630v1 Announce Type: new Abstract: Patient-facing medical chatbots are commonly evaluated on single-turn prompts, yet real users push back after refusals, add urgency, and invoke authority. We introduce MultiTurnPSB, a four-turn adversarial extension of PatientSafetyBench, and evaluate GPT-4.1-mini under fixed template, template-adaptive, and live adversarial attacks. Unsafe responses rise from 35% to nearly 80% by Turn 4 under live attack.

arXiv CS 7d ago

Evaluating Factual Density in Multi-Source RAG: A Study in Medical AI Accuracy

arXiv:2605.31506v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) is the current industry standard for grounding AI in real-world facts. Traditional retrieval methods rely on keyword matching and topic proximity, ranking content based on how closely it sounds like the user's query. What they do not measure is how many verified facts the content actually contains.

arXiv CS 9d ago

Truth, Trust, and Trouble: Medical AI on the Edge

Announce Type: replace Abstract: Large Language Models (LLMs) hold significant promise for transforming digital health by enabling automated medical question answering. However, ensuring these models meet critical industry standards for factual accuracy, usefulness, and safety remains a challenge, especially for open-source solutions. We present a rigorous benchmarking framework using a dataset of over 1,000 health questions.

arXiv CS 8d ago

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

arXiv:2606.01961v1 Announce Type: new Abstract: Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research...

arXiv CS 8d ago

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

arXiv:2606.01961v2 Announce Type: replace Abstract: Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research...

arXiv CS 6d ago

ReclAIm: A Multi-Agent Framework for Monitoring and Correcting Performance Decline in Medical Imaging AI

Announce Type: replace Abstract: Purpose: To develop and evaluate a multi-agent framework (ReclAIm) for automated monitoring, detection, and correction of performance decline in medical image classification models. Materials and Methods: ReclAIm is a large language model-based multi-agent system that operates through natural language interaction. A master agent coordinating three task-specific agents performed performance evaluation and triggered fine-tuning when substantial performance...

arXiv CS 2d ago

AI: Doctors risk being sued if tools go wrong, while companies are “shielded,” report warns

Doctors are being left exposed to legal claims by a “widening gulf” between the law and the rapidly changing use of artificial intelligence (AI) in healthcare, the Medical Protection Society (MPS) has warned. In its report Closing the AI Liability Gap,1 the medical defence organisation said that doctors and the NHS were currently expected to absorb all legal responsibility for AI use in healthcare, while AI companies were “shielded.”Under

BMJ (British Medical Journal) 1d ago

Clinical Reasoning in the Age of AI: Longitudinal Cognition and Human-AI Collaboration

arXiv:2606.08442v1 Announce Type: new Abstract: As physicians turn to AI-powered systems to help meet the dual demands of speed and care quality, they are met with hallucinations and sycophancy. Understanding how doctors reason through clinical problems in real-world settings is critical for design of effective AI reasoning systems. While recent advances in medical AI have emphasized performance benchmarks and diagnostic accuracy, comparatively little attention has been paid to the structure...

arXiv CS 1d ago

Doctors and NHS could be sued for mistakes made by AI tools, report warns

Medical Protection Society calls for law to be overhauled to help medics avoid liability for errors made by technologyDoctors and the NHS could be sued for medical negligence over mistakes made by artificial intelligence tools used in diagnosing patients and suggesting their treatment, ministers are being warned. Under the law as it stands, medics and the health service can be held liable for patients being harmed or dying even if it was AI that made the errors that resulted in their suffering.

The Guardian Health 1d ago

Doctors and NHS could be sued for mistakes made by AI tools, report warns

Medical Protection Society calls for law to be overhauled to help medics avoid liability for errors made by technologyDoctors and the NHS could be sued for medical negligence over mistakes made by artificial intelligence tools used in diagnosing patients and suggesting their treatment, ministers are being warned. Under the law as it stands, medics and the health service can be held liable for patients being harmed or dying even if it was AI that made the errors that resulted in their suffering.

The Guardian UK 1d ago