Home Health MedVision: Benchmarking Quantitative Medical Image Analysis
Health

MedVision: Benchmarking Quantitative Medical Image Analysis

Key Points

Announce Type: replace Abstract: Current vision-language models (VLMs) in medicine are primarily designed for categorical question answering (e.g., "Is this normal or abnormal?") or qualitative descriptive tasks. However, clinical decision-making often relies on quantitative assessments, such as measuring the size of a tumor or the angle of a joint, from which physicians draw their own diagnostic conclusions. This quantitative reasoning capability remains underexplored and poorly supported...

arXiv:2511.18676v2 Announce Type: replace Abstract: Current vision-language models (VLMs) in medicine are primarily designed for categorical question answering (e.g., "Is this normal or abnormal?") or qualitative descriptive tasks. However, clinical decision-making often relies on quantitative assessments, such as measuring the size of a tumor or the angle of a joint, from which physicians draw their own diagnostic conclusions. This quantitative reasoning capability remains underexplored and poorly supported in existing VLMs. In this work, we introduce MedVision, a large-scale dataset and benchmark specifically designed to evaluate and improve VLMs on quantitative medical image analysis. MedVision spans 22 public datasets covering diverse anatomies and modalities, with 30.8 million image-annotation pairs. We focus on three representative quantitative tasks: (1) detection of anatomical structures and abnormalities, (2) tumor/lesion (T/L) size estimation, and (3) angle/distance (A/D) measurement. We show that current off-the-shelf VLMs perform poorly on these tasks. However, supervised and reinforcement fine-tuning on MedVision significantly enhances performance across detection, T/L estimation, and A/D measurement. MedVision provides a foundation for developing VLMs with robust quantitative reasoning capabilities in medical imaging.
MedVision (ORG) Benchmarking Quantitative Medical Image Analysis arXiv:2511.18676v2 (ORG)
Originally published by arXiv CS Read original →