Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning

arXiv CS, Thursday 04 June 2026, 04:00 UTC. By Ritvik Rastogi, Vishal Singh, Tejas Chaudhari, Sandeep Varma

Key Points

arXiv:2605.28829v2. Announce type: replace. Abstract: Competitive STEM examinations such as JEE and NEET require multi-step symbolic reasoning, precise numerical computation, and deep conceptual understanding across physics, chemistry, and mathematics. Recent large language models perform strongly on common reasoning benchmarks, yet they remain difficult to deploy at scale, where millions of student doubts demand domain-specific, consistently structured problem solving. We introduce Aryabhata 2, a reasoning-focused language model for competitive STEM examinations, trained via reinforcement-learning post-training. Using PhysicsWallah's internal question banks, we construct a high-quality training curriculum and post-train GPT-OSS-20B through reinforcement learning with verifiable rewards. Training combines prolonged reinforcement learning with broadened exploration via progressively larger rollout group sizes. We evaluate Aryabhata 2 on competitive examination benchmarks, including JEE Main, JEE Advanced, and NEET, as well as out-of-distribution reasoning datasets such as AIME, HMMT, MMLU-Pro, MMLU-Redux 2.0, and GPQA. Results show that Aryabhata 2 outperforms its base model GPT-OSS-20B on competitive STEM reasoning while requiring substantially fewer output tokens (up to 64% fewer).
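The abstract names two training ingredients, verifiable rewards and progressively larger rollout groups, without giving details. The sketch below is one common way such pieces fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the exact-match reward, the group-normalized advantage, the function names, and the step thresholds and group sizes in the schedule.

```python
import statistics


def verifiable_reward(predicted: str, reference: str) -> float:
    """Hypothetical checker: 1.0 if the final answer matches exactly, else 0.0."""
    return 1.0 if predicted.strip() == reference.strip() else 0.0


def group_advantages(rewards: list[float]) -> list[float]:
    """Center and scale rewards within one rollout group (assumed scheme)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        # Every rollout scored the same, so the group carries no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


def group_size_for_step(step: int, schedule=((0, 8), (1000, 16), (2000, 32))) -> int:
    """Illustrative schedule: more rollouts per question as training progresses."""
    size = schedule[0][1]
    for start, n in schedule:
        if step >= start:
            size = n
    return size


# Score a group of four sampled answers to one question whose reference is "42".
answers = ["42", "41", "7", " 42 "]
rewards = [verifiable_reward(a, "42") for a in answers]
advantages = group_advantages(rewards)
```

Under these assumptions, a larger group samples more candidate solutions per question, which makes it more likely that at least one rollout earns a nonzero reward on hard problems.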


Originally published by arXiv CS.

