Are LLMs Ready for Neural-integrated Mechanistic Modeling? A Benchmark and Agentic Framework

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Zihan Guan, Rituparna Datta, Mengxuan Hu, Shunshun Liu, Aiying Zhang, Prasanna Balachandran, Sheng Li, Anil Vullikanti

Key Points

arXiv:2602.18008v2. Abstract: Large language models (LLMs) have shown promise in constructing mechanistic models from data. However, existing evaluations largely focus on simplified settings and fail to capture the complexity of real-world scientific modeling. In practice, such modeling often involves neural-integrated formulations, where a mechanistic model component and a neural network component are jointly constructed, leading to a significantly more complex search space. Motivated by this gap, we introduce the Neural-Integrated Mechanistic Modeling (NIMM) benchmark, which evaluates LLM-generated neural-integrated mechanistic models across three scientific domains. Experiments on NIMM reveal that existing LLM-based approaches struggle to effectively explore this complex space, resulting in limited search stability and solution quality. To address this challenge, we propose NIMMGen, a tree-guided agentic framework that enables diversified exploration via branch-level search and improves solutions through atomic model refinement. Extensive experiments demonstrate that NIMMGen achieves state-of-the-art performance on NIMM, significantly improving search stability and solution quality.
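To make the term "neural-integrated formulation" concrete, here is a minimal sketch of what such a model can look like: a textbook SIR compartmental model whose transmission rate comes from a tiny neural network instead of a fixed constant. Everything in it is an assumption chosen for illustration and none of it is taken from the paper: the choice of SIR, the names `neural_beta` and `simulate_sir`, the network shape, and the hand-picked weights. It is not the NIMM benchmark or the NIMMGen method.

```python
import math

def neural_beta(t, w1, b1, w2, b2):
    """Hypothetical one-hidden-layer network mapping time t to a positive rate."""
    hidden = [math.tanh(w * t + b) for w, b in zip(w1, b1)]
    raw = sum(w * h for w, h in zip(w2, hidden)) + b2
    return math.log1p(math.exp(raw))  # softplus keeps the rate positive

def simulate_sir(params, gamma=0.1, s0=0.99, i0=0.01, days=100, dt=0.1):
    """Forward-Euler integration of SIR with a network-supplied transmission rate."""
    s, i, r = s0, i0, 0.0
    trajectory = [(s, i, r)]
    for k in range(round(days / dt)):
        beta = neural_beta(k * dt, *params)  # neural component
        new_inf = beta * s * i * dt          # mechanistic component: infection
        new_rec = gamma * i * dt             # mechanistic component: recovery
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        trajectory.append((s, i, r))
    return trajectory

# Illustrative, hand-picked weights (w1, b1, w2, b2); a real system would fit these to data.
params = ([0.05, -0.03], [0.0, 1.0], [0.4, 0.2], -1.0)
trajectory = simulate_sir(params)
```

In this sketch the compartment structure is fixed by hand and only the weights are free. The search space the abstract describes is larger because the mechanistic structure and the neural component are both constructed jointly.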


Originally published by arXiv CS.


