How Can Reinforcement Learning Achieve Expert-level Placement?

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Ruo-Tong Chen, Ke Xue, Chengrui Gao, Yunqi Shi, Tian Xu, Peng Xie, Siyuan Xu, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou


arXiv:2604.25191v2. Abstract: Chip placement is a critical step in physical design. Reinforcement learning (RL)-based placement methods have recently emerged, but their training focuses primarily on wirelength optimization, so they often fail to achieve expert-quality layouts. We identify reward design as the primary cause of the performance gap with experts. Rather than formalizing the intricate processes experts follow, we learn a reward model directly from expert layouts. Our approach starts from the final expert layouts and infers step-by-step expert trajectories. Using these trajectories as demonstrations or preferences, we train a model that captures the implicit rewards in expert results. Experiments show that our framework can learn efficiently from even a single design and generalizes well to unseen cases.
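
The abstract does not spell out how trajectories are inferred or how the reward model is trained, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a hypothetical replay order (largest macro first), two hand-made features, a linear reward, and a Bradley-Terry preference loss in which the expert's position is preferred over a random one. All function names, features, and data here are illustrative assumptions.

```python
import math
import random

def infer_trajectory(final_layout, sizes):
    """Replay a finished layout as (partial_layout, macro, position) steps.

    Assumption: macros are replayed in descending area order; the abstract
    does not state the actual ordering rule.
    """
    order = sorted(final_layout, key=lambda m: -sizes[m][0] * sizes[m][1])
    partial, steps = {}, []
    for macro in order:
        steps.append((dict(partial), macro, final_layout[macro]))
        partial[macro] = final_layout[macro]
    return steps

def features(partial, position):
    """Hypothetical features on a unit canvas: distance to the nearest
    canvas edge and distance to the nearest already-placed macro."""
    x, y = position
    edge = min(x, y, 1.0 - x, 1.0 - y)
    near = min((math.hypot(x - px, y - py) for px, py in partial.values()),
               default=0.0)
    return [edge, near]

def reward(w, partial, position):
    """Linear reward model (an assumption made for brevity)."""
    return sum(wi * fi for wi, fi in zip(w, features(partial, position)))

def train_reward(steps, epochs=200, lr=0.5, seed=0):
    """Fit w so the expert position is preferred over a random position
    (Bradley-Terry / logistic preference loss, plain SGD)."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(epochs):
        for partial, _macro, expert_pos in steps:
            rand_pos = (rng.random(), rng.random())
            fe = features(partial, expert_pos)
            fr = features(partial, rand_pos)
            diff = sum(wi * (a - b) for wi, a, b in zip(w, fe, fr))
            diff = max(-30.0, min(30.0, diff))  # keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-diff))   # P(expert is preferred)
            for i in range(len(w)):
                # Gradient ascent on log p
                w[i] += lr * (1.0 - p) * (fe[i] - fr[i])
    return w

# Toy "expert" layout: macros pushed toward the canvas periphery.
expert_layout = {"A": (0.05, 0.05), "B": (0.95, 0.05),
                 "C": (0.05, 0.95), "D": (0.95, 0.90)}
macro_sizes = {"A": (0.20, 0.20), "B": (0.15, 0.20),
               "C": (0.10, 0.20), "D": (0.10, 0.10)}

steps = infer_trajectory(expert_layout, macro_sizes)
w = train_reward(steps)
```

In this toy setup a single layout yields one preference pair per macro per epoch, which loosely mirrors the abstract's point that one expert design can already supply a training signal. A learned reward of this kind could then stand in for, or complement, a wirelength-only reward during RL training.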

Originally published by arXiv CS.
