GraspFoM: Towards Reconstruction-Driven Robotic Grasping with 3D Foundation Priors

arXiv CS, Tuesday 09 June 2026, 04:00 UTC. By Dongli Wu, Xiaobao Wei, Hao Wang, Qiaochu Dong, Ying Li, Qingpo Wuwu, Ming Lu, Wufan Zhao

arXiv:2606.08440v1 Announce Type: new

Abstract: Robotic grasping is a fundamental capability in robotic manipulation, yet it remains challenging under partial observations. Reliable grasping depends on both local contact cues and object-level 3D structure. Existing geometry-aware grasping methods recognize the value of reconstruction, but they typically treat geometry as an intermediate prediction rather than a reusable object prior for grasping. In this paper, we present GraspFoM, a unified framework that leverages 3D foundation priors (SAM3D) to build a shared 3D object latent for both reconstruction and grasp pose prediction. Built on this shared object latent, we introduce an anchor-initialized truncated pose-reasoning diffuser that predicts continuous and multimodal grasp poses without directly relying on discrete grasp candidates. We further investigate the interaction between reconstruction and grasping through a reconstruction-aware scorer and a residual latent updater. Reconstruction provides grounded geometric cues, while grasp supervision refines the shared object latent toward grasp-relevant affordances. GraspFoM jointly predicts grasp poses and reconstructs high-fidelity 3D assets in mesh and 3DGS forms. Comprehensive experiments demonstrate that GraspFoM achieves state-of-the-art results on both reconstruction and grasping. Notably, these improvements require only a small number of additional trainable parameters. Ablation studies also demonstrate the contribution of each component.
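The abstract gives no implementation details for the anchor-initialized truncated diffuser, so the following is only a toy illustration of the general idea and not the authors' method. It assumes a "truncated" sampler starts from a lightly perturbed anchor pose at an intermediate step instead of from pure noise, and runs only the remaining few refinement steps. The names `toy_denoiser` and `truncated_refine`, the parameters `start_step` and `noise_scale`, and the reduction of a grasp pose to a 3D position are all hypothetical simplifications; a real system would use a learned network conditioned on the object latent and full 6-DoF poses.

```python
import math
import random


def toy_denoiser(pose, modes):
    """Hypothetical stand-in for a learned denoiser.

    Returns the nearest entry of `modes`, a hand-picked list of "good grasp"
    positions that plays the role of the network's clean-pose estimate.
    """
    return min(modes, key=lambda m: math.dist(pose, m))


def truncated_refine(anchor, modes, start_step=4, noise_scale=0.05, rng=None):
    """Refine one anchor with a short (truncated) reverse process.

    Assumption: instead of sampling pure noise and running a full schedule,
    we perturb the anchor slightly and run only `start_step` reverse steps.
    """
    if rng is None:
        rng = random.Random(0)
    # Start from the anchor plus a small perturbation, not from pure noise.
    pose = [a + rng.gauss(0.0, noise_scale) for a in anchor]
    for step in range(start_step, 0, -1):
        target = toy_denoiser(pose, modes)
        blend = 1.0 / step  # the final step (step == 1) moves fully onto the estimate
        pose = [p + blend * (t - p) for p, t in zip(pose, target)]
        if step > 1:
            # Re-inject a shrinking amount of noise between steps.
            sigma = noise_scale * (step - 1) / start_step
            pose = [p + rng.gauss(0.0, sigma) for p in pose]
    return pose


# Two hypothetical grasp modes and one coarse anchor near each of them.
MODES = [[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ANCHORS = [[0.1, 0.0, 0.0], [0.9, 1.0, 0.0]]
refined = [truncated_refine(a, MODES, rng=random.Random(i)) for i, a in enumerate(ANCHORS)]
```

In this sketch each anchor is refined independently, so different anchors can settle on different modes. That is one simple way a sampler can stay multimodal while still producing continuous outputs rather than picking from a fixed candidate list.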

Originally published by arXiv CS.
