LookWise: Knowing When and Where to Look for Fine-Grained Visual Reasoning in Multimodal Large Language Models

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Yuxiang Shen, Hailong Huang, Zhenkun Gao, Xueheng Li, Man Zhou, Chengjun Xie, Haoxuan Che, Xuanhua He, Jie Zhang

Key Points

arXiv:2603.00171v3 (announce type: replace)

Abstract: Multimodal Large Language Models (MLLMs) are shifting towards "Thinking with Images", actively exploring image details during reasoning. Training models to do this at scale is effective but computationally expensive, which has spurred growing interest in lightweight, training-free solutions. Existing training-free methods, however, suffer from two flaws: perceptual redundancy from indiscriminate cropping, which increases computational cost and introduces noise; and a drift between semantic intent and spatial attention, which prevents accurate localization of the regions the user is asking about. To address these challenges, we propose LookWise, a framework for adaptive visual reasoning. LookWise follows a two-stage pipeline: a confidence-based module decides when to look more carefully, and a semantic-guided localization module determines where to look. This design enables MLLMs to adaptively acquire fine-grained visual evidence without additional training. Experiments on fine-grained and high-resolution visual reasoning benchmarks show that LookWise consistently improves accuracy over strong baselines, achieves an approximately $4.0\times$ inference speedup over the search-based method ZoomEye, and generalizes robustly across models.
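The two-stage "when, then where" control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how confidence or localization is computed, so the max-softmax confidence score, the `threshold` and `keep_ratio` parameters, the relevance grid, and every function name below are assumptions made purely for illustration.

```python
import math
from typing import List, Optional, Tuple

# (row_start, col_start, row_end, col_end) in grid cells, end-exclusive.
Box = Tuple[int, int, int, int]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def should_look_closer(answer_logits: List[float], threshold: float = 0.7) -> bool:
    """Stage 1 ("when"), hypothetical rule: look closer only if the
    top answer probability falls below the threshold."""
    return max(softmax(answer_logits)) < threshold


def locate_region(relevance: List[List[float]], keep_ratio: float = 0.5) -> Box:
    """Stage 2 ("where"), hypothetical rule: bounding box of all grid cells
    whose relevance is at least keep_ratio times the peak value.
    Assumes a non-empty grid of non-negative scores."""
    peak = max(max(row) for row in relevance)
    cutoff = keep_ratio * peak
    hits = [
        (r, c)
        for r, row in enumerate(relevance)
        for c, value in enumerate(row)
        if value >= cutoff
    ]
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def adaptive_look(
    answer_logits: List[float],
    relevance: List[List[float]],
    threshold: float = 0.7,
) -> Optional[Box]:
    """Return a region to crop, or None when the first-pass answer is
    already confident and no extra look is needed."""
    if not should_look_closer(answer_logits, threshold):
        return None
    return locate_region(relevance)
```

In this sketch a confident first pass skips cropping entirely, which is one way a gating step could avoid the redundant crops the abstract attributes to indiscriminate cropping. An uncertain pass yields a single box around the most question-relevant cells rather than triggering an exhaustive search.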


Originally published by arXiv CS.

