Stereo Transformer

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

StereoPolicy: Improving Robotic Manipulation Policies via Stereo Perception

arXiv:2605.09989v2. Recent advances in robot imitation learning have produced powerful visuomotor policies that manipulate diverse objects from visual inputs. However, monocular observations lack depth information, which is critical for precise manipulation in cluttered or geometrically complex scenes. Explicit depth maps and point clouds are often noisy and fragile in real-world manipulation.

arXiv CS 5d ago

RCM-ACT: Imitation Learning with Dynamic RCM Calibration for Autonomous Intraocular Foreign Body Removal

arXiv:2508.19191v3. Intraocular foreign body removal demands millimeter-level precision in confined intraocular spaces, yet existing robotic systems predominantly rely on manual teleoperation with steep learning curves. To address the challenges of autonomous manipulation, particularly kinematic uncertainties from variable motion scaling and Remote Center of Motion (RCM) point variation, we propose RCM-ACT, an imitation learning framework for autonomous...

arXiv CS 8d ago

HORUS: A Mixed Reality Interface for Managing Teams of Mobile Robots

arXiv:2506.02622v2. Mixed Reality (MR) interfaces have been extensively explored for controlling mobile robots, but there is limited research on their application to managing teams of robots. This paper presents HORUS: Holistic Operational Reality for Unified Systems, a Mixed Reality interface offering a comprehensive set of tools for managing multiple mobile robots simultaneously. HORUS enables operators to monitor individual robot statuses, visualize sensor...

arXiv CS 2d ago

Unpaired RGB-Thermal Gaussian-Splatting Using Visual Geometric Transformers

arXiv:2606.05491v1. Multi-modal novel view synthesis (NVS) combining RGB and thermal imagery enables precise 3D scene reconstruction with visual and thermal information. However, existing methods typically rely on precisely calibrated RGB-thermal image pairs or stereo setups, limiting scalability and practical deployment.

arXiv CS 5d ago

A Survey of 3D Reconstruction with Event Cameras

Event cameras are rapidly emerging as powerful vision sensors for 3D reconstruction, uniquely capable of asynchronously capturing per-pixel brightness changes. Compared to traditional frame-based cameras, event cameras produce sparse yet temporally dense data streams, enabling robust and accurate 3D reconstruction even under challenging conditions such as high-speed motion, low illumination, and extreme dynamic range scenarios. These capabilities offer...

arXiv CS 8d ago

PlayStation Architecture

A quick introduction: Sony knew that 3D hardware could get very messy to develop for. Thus, their debuting console will keep its design simple and practical… Although this may come at a cost!

Hacker News 7d ago

Magenta RealTime 2: Open and Local Live Music Models

We’re excited to share Magenta RealTime 2 (MRT2), a state-of-the-art open model and efficient real-time inference engine that enables you to build and play AI musical instruments on your laptop! To get started, download the apps on your MacBook (requires Apple Silicon). Unlike other large generative music models that work offline to turn a prompt into a track, MRT2 is a live, interactive model that you can control with MIDI and audio, in addition to text.

Hacker News 5d ago

The Apple Car Is Finally Here

Transportation has never been a Ferrari’s real purpose. Sure, you can drive one—although not literally you, because you probably can’t afford one. For the few who can, it is an automobile to be seen idling at a stoplight before prancing away, or parked at a luxury-hotel valet stand, inspiring desire and jealousy. For normal people, a Ferrari is a symbol: of power, control, precision, and wealth—but also of the longing for those virtues, and of the idea that they are virtues in the first place. The Ferrari is the quintessential bedroom-poster car, captured in a glossy photo pinned on a wall in a teenage boy’s bedroom like a photo of a scantily clad woman: an unachievable object of desire. If a Ferrari is an object of spectacle, an Apple device is an object of function. The Apple product, whether it’s a laptop, music player, smartphone, tablet, speaker, or watch, is designed to dissolve into its context and melt into ordinary life. Frictionless, intuitive, and transparent—in its ideal form, an Apple product ceases to feel like an object at all, and instead facilitates an activity. An iPhone or MacBook

The Atlantic 12d ago