NuScenes

Related Articles from SNS

IAF-Net: Illumination-Adaptive Fusion for Low-Light Urban Road Segmentation

Abstract: Semantic road segmentation is important for autonomous driving, but existing methods suffer severe performance degradation under low-light conditions. Many existing multi-modal fusion methods do not explicitly adapt to illumination-dependent changes in modality reliability, which can propagate degraded RGB features into the fused representation at night. We propose IAF-Net (Illumination-Adaptive Fusion Network), an end-to-end framework with illumination-adaptive...

arXiv CS 9d ago

PLAN-S: Bridging Planning with Latent Style Dynamics for Autonomous Driving World Models

Abstract: Latent world models (LWMs) have strengthened end-to-end autonomous driving by forecasting compact scene dynamics for downstream planning. However, existing LWM-based planners usually generate trajectories directly from entangled latent representations. This compact latent-to-planner pathway lacks explicit modeling of risk, drivability, and diverse style preferences, making driving-style dynamics difficult to supervise, inspect, or modulate before a final...

arXiv CS 5d ago

Learned Non-Maximum Suppression for 3D Object Detection

arXiv:2606.03568v1. Abstract: Post-processing is a critical stage in LiDAR-based 3D object detection, where dense and overlapping proposals must be filtered for compact and reliable perception. This work introduces two learned filtering modules that replace heuristic non-maximum suppression (NMS) by leveraging relations among detections. D2D-Rescore employs transformer-based detection-to-detection (D2D) attention, while GossipNet3D adapts the 2D GossipNet concept to 3D...

arXiv CS 7d ago

Can BEV Perception Gracefully Degrade under Sensor Failures?

arXiv:2605.30983v1. Abstract: Despite the remarkable success of multi-modal bird's-eye view (BEV) perception in autonomous driving, current systems exhibit a critical vulnerability: existing fusion mechanisms are highly brittle to sensor corruptions, often causing catastrophic performance degradation. This vulnerability largely stems from the fact that standard fusion frameworks typically integrate multi-modal representations in a static manner, leading to a precipitous...

arXiv CS 9d ago

PillarDETR: YOLO-Backbone and RT-DETR Head for Real-Time 3D Object Detection

Abstract: Real-time 3D object detection is a critical component for the safe operation of autonomous driving systems and robotics. While LiDAR point clouds provide accurate spatial information, processing them efficiently remains a significant challenge. Traditional methods rely on complex 3D convolutions or anchor-based paradigms that struggle to balance detection accuracy with inference speed.

arXiv CS 8d ago

Towards Compact Autonomous Driving Perception with Balanced Learning and Multi-sensor Fusion

Abstract: We present a novel compact deep multi-task learning model to handle various autonomous driving perception tasks in one forward pass. The model performs multiple views of semantic segmentation, depth estimation, light detection and ranging (LiDAR) segmentation, and bird's eye view projection simultaneously without being supported by other models. We also provide an adaptive loss weighting algorithm to tackle the imbalanced learning issue that occurred due to...

arXiv CS 7d ago

Co-Fusion4D: Spatio-temporal Collaborative Fusion for Robust 3D Object Detection

Abstract: In autonomous driving, 3D object detection is essential for accurate perception and reliable decision-making. However, object motion and ego-motion often induce cross-frame spatiotemporal inconsistencies in BEV-based detectors, leading to temporal BEV feature misalignment and degraded spatiotemporal consistency. To address these challenges, we propose Co-Fusion4D, a unified framework that explicitly preserves cross-frame spatiotemporal consistency and...

arXiv CS 8d ago

DVGT: Driving Visual Geometry Transformer

Abstract: Perceiving and reconstructing 3D scene geometry from visual inputs is crucial for autonomous driving. However, there still lacks a driving-targeted dense geometry perception model that can adapt to different scenarios and camera configurations. To bridge this gap, we propose a Driving Visual Geometry Transformer (DVGT), which reconstructs a global dense 3D point map from a sequence of unposed multi-view visual inputs.

arXiv CS 6d ago

SparseStreet: Sparse Gaussian Splatting for Real-Time Street Scene Simulation

arXiv:2606.03909v1. Abstract: While 3D Gaussian Splatting has shown promising results in street scene reconstruction, existing methods require massive numbers of Gaussian primitives to capture fine details, leading to prohibitive storage costs and slow rendering speeds. We observe that dynamic objects (e.g., vehicles and pedestrians) demand high-fidelity representations to maintain temporal consistency, while static background regions often contain substantial redundancy....

arXiv CS 7d ago

UnsOcc: 3D Semantic Occupancy Prediction in Unstructured Scene via Rendering Fusion

arXiv:2606.03581v1. Abstract: Unstructured scenes present unique challenges for autonomous driving, as irregular obstacles and sparse scene layouts undermine the effectiveness of traditional perception methods such as 3D object detection. 3D semantic occupancy prediction has emerged as a prominent focus due to its ability to provide dense spatial representations by assigning semantic labels to individual voxels in 3D space. However, directly applying 3D semantic occupancy...

arXiv CS 7d ago