MedVision: Benchmarking Quantitative Medical Image Analysis

arXiv CS, Tuesday 09 June 2026, 04:00 UTC
By Yongcheng Yao, Yongshuo Zong, Raman Dutt, Yongxin Yang, Sotirios A Tsaftaris, Timothy Hospedales


arXiv:2511.18676v2

Abstract: Current vision-language models (VLMs) in medicine are primarily designed for categorical question answering (e.g., "Is this normal or abnormal?") or qualitative descriptive tasks. However, clinical decision-making often relies on quantitative assessments, such as measuring the size of a tumor or the angle of a joint, from which physicians draw their own diagnostic conclusions. This quantitative reasoning capability remains underexplored and poorly supported in existing VLMs. In this work, we introduce MedVision, a large-scale dataset and benchmark specifically designed to evaluate and improve VLMs on quantitative medical image analysis. MedVision spans 22 public datasets covering diverse anatomies and modalities, with 30.8 million image-annotation pairs. We focus on three representative quantitative tasks: (1) detection of anatomical structures and abnormalities, (2) tumor/lesion (T/L) size estimation, and (3) angle/distance (A/D) measurement. We show that current off-the-shelf VLMs perform poorly on these tasks. However, supervised and reinforcement fine-tuning on MedVision significantly enhances performance across detection, T/L estimation, and A/D measurement. MedVision provides a foundation for developing VLMs with robust quantitative reasoning capabilities in medical imaging.
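The abstract names the three tasks but does not say how predictions are scored. As a rough illustration only, the sketch below shows one common way each kind of output could be compared with a reference: intersection-over-union for a detection box, relative error for a size or distance estimate, and the angle formed by three landmark points. The function names, the box format (x_min, y_min, x_max, y_max), and the choice of metrics are assumptions made for this example, not details taken from MedVision.

```python
import math

# Illustrative helpers only; names and metric choices are assumptions,
# not the MedVision evaluation protocol.

def box_iou(a, b):
    """Intersection-over-union of two boxes (x_min, y_min, x_max, y_max)."""
    # Overlap width and height, clamped at zero when the boxes do not intersect.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def relative_error(predicted, reference):
    """Absolute error as a fraction of the reference value (e.g. a size in mm)."""
    if reference == 0:
        raise ValueError("reference measurement must be non-zero")
    return abs(predicted - reference) / abs(reference)

def angle_degrees(p, vertex, q):
    """Angle at `vertex` between the rays toward landmarks p and q, in degrees."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 on |cross| and dot gives an unsigned angle in [0, 180].
    return math.degrees(math.atan2(abs(cross), dot))

if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(relative_error(11.0, 10.0))
    print(angle_degrees((1, 0), (0, 0), (0, 1)))
```

Relative error is used here instead of absolute error so that a 1 mm miss on a 10 mm lesion and a 10 mm miss on a 100 mm lesion count the same; whether the benchmark makes that choice is not stated in the abstract.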


Originally published by arXiv CS.
