Home › Knowledge Base › Bird's-Eye-View

Bird's-Eye-View

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

MB-Loc: Multi-planar Bird's-eye-view Localization in outdoor LiDAR scenes

Announce Type: new Abstract: Global LiDAR localization is a fundamental task for autonomous navigation systems. Recent methods perform Scene Coordinate Regression (SCR) and achieve superior accuracy over Absolute Pose Regression (APR) solutions by predicting dense 3D world coordinates. However, SCR approaches introduce two major bottlenecks: severe computational inefficiency from processing raw 3D geometries and significant performance degradation under varying sensor viewpoints.

arXiv CS 1d ago

PathPainter: Transferring the Generalization Ability of Image Generation Models to Embodied Navigation

arXiv:2605.07496v2 Announce Type: replace Abstract: Bird's-eye-view (BEV) images have been widely demonstrated to provide valuable prior information for navigation. Given the global information provided by such views, two key challenges remain: how to fully exploit this information and how to reliably use it during execution. In this paper, we propose a navigation system that uses BEV images as global priors and is designed for ground and near-ground robotic platforms.

arXiv CS 2d ago

RESBev: Making BEV Perception More Robust

arXiv:2603.09529v2 Announce Type: replace Abstract: Bird's-eye-view (BEV) perception has emerged as a cornerstone of autonomous driving systems, providing a structured, ego-centric representation critical for downstream planning and control. However, real-world deployment faces challenges from sensor degradation and adversarial attacks, which can cause severe perceptual anomalies and ultimately compromise the safety of autonomous driving systems. To address this, we propose a resilient and...

arXiv CS 8d ago

BEV-ODOM2: Enhanced BEV-based Monocular Visual Odometry with PV-BEV Fusion and Dense Flow Supervision for Ground Robots

Announce Type: replace Abstract: Scale-consistent ego-motion estimation is fundamental for autonomous ground robots. Bird's-Eye-View (BEV) representation naturally addresses the scale drift problem of monocular visual odometry (MVO) by providing a metric-scaled planar workspace, enabling the simplification of 6-DoF ego-motion to a more robust 3-DoF model. However, existing BEV-based methods suffer from two key limitations: sparse supervision signals from pose-only training, and information...

arXiv CS 7d ago

A 3D Isovist World Model -- Revealing a City's Unseen Geometry and Its Emergent Cross-City Signature

Announce Type: new Abstract: Embodied agents that navigate cities rely on world models that predict how their surroundings will change as they move. But for navigation, what matters is not what the buildings look like; it is where the agent can go. Most world models nonetheless predict appearance, learning how a scene looks rather than the space an agent can move through.

arXiv CS 7d ago

A 3D Isovist World Model -- Revealing a City's Unseen Geometry and Its Emergent Cross-City Signature

Announce Type: replace Abstract: Embodied agents that navigate cities rely on world models that predict how their surroundings will change as they move. But for navigation, what matters is not what the buildings look like; it is where the agent can go. Most world models nonetheless predict appearance, learning how a scene looks rather than the space an agent can move through.

arXiv CS 6d ago

Dexterity-BEV: Aligning 3D World and Actions for Generalizable Robot Policies Learning

Announce Type: new Abstract: End-to-end manipulation policies, combined with web-scale pretrained Vision-Language Models (VLMs), show the promise for generalizable and dexterous robotic manipulation. However, they inherit two key limitations from 2D foundation models: 1) the reliance on 2D RGB inputs that ignores the intrinsically 3D nature of manipulation; and 2) the lack of spatial 3D alignment between input-output spaces as well as across diverse robot embodiments, camera setups, and...

arXiv CS 8d ago

Dexterity-BEV: Aligning 3D World and Actions for Generalizable Robot Policies Learning

Announce Type: replace Abstract: End-to-end manipulation policies, combined with web-scale pretrained Vision-Language Models (VLMs), show the promise for generalizable and dexterous robotic manipulation. However, they inherit two key limitations from 2D foundation models: 1) the reliance on 2D RGB inputs that ignores the intrinsically 3D nature of manipulation; and 2) the lack of spatial 3D alignment between input-output spaces as well as across diverse robot embodiments, camera setups, and...

arXiv CS 1d ago

Z-FLoc: Zero-Shot Floorplan Localization via Geometric Primitives

Announce Type: new Abstract: Visual localization -- estimating a camera pose within a pre-existing map -- is a fundamental problem in computer vision. Floorplans are an attractive map representation: they are readily available for most buildings, compact, and inherently invariant to visual appearance changes. However, bridging the severe domain gap between camera observations and floorplan geometry remains challenging.

arXiv CS 6d ago