Vision-Based

Related Articles from SNS

AgenticDiffusion: Agentic Diffusion-based Path Planning for Vision-Based UAV Navigation

arXiv:2606.04111v1 Announce Type: new Abstract: Indoor UAV navigation requires efficient exploration, scene understanding, and reliable trajectory execution under limited field-of-view observations. Existing vision-based navigation frameworks typically rely on single-view observations, limiting their ability to reason about occlusions, target visibility, and global scene structure. In this work, we propose AgenticDiffusion, a multi-view UAV navigation framework that coordinates...

arXiv CS 6d ago

PerchRL: Vision-Based Agile Perching on Inclined Platforms under Rapid and Irregular Motion

arXiv:2606.03441v1 Announce Type: new Abstract: Autonomous vision-based perching of quadrotors on moving inclined platforms is critical for air-ground collaboration but remains challenging due to the limited field of view (FOV). In this paper, we propose PerchRL, a reinforcement learning (RL) framework for vision-based agile perching on inclined platforms under rapid and irregular motion. Specifically, we employ a two-stage learning strategy consisting of state-based pre-training followed by...

arXiv CS 7d ago

Generalization of World Models under Environmental Variability for Vision-based Quadrotor Navigation

arXiv:2606.05015v1 Announce Type: new Abstract: World models, learned generative models that predict how an environment evolves, have become a promising tool for sample-efficient robot learning. Yet how robust they are to environmental variability remains poorly understood.

arXiv CS 6d ago

Vision-Based Localization in Dense Urban Environments: A Case Study of an Urban Village in China

Announce Type: new Abstract: Urban villages, the widespread informal settlements which have emerged as a result of rapid urbanization, are now major residential hubs for migrant workers in large cities in China. The dense arrangement of buildings in these areas often leads to unreliable GPS signals, while incomplete mapping data further impairs accurate route planning and navigation. These issues not only hinder everyday mobility but also pose significant challenges for emergency response,...

arXiv CS 9d ago

Synthetic Data Generation and Vision-based Wrinkle and Keypoint Detection for Bimanual Cloth Manipulation

arXiv:2606.06292v1 Announce Type: new Abstract: Robotic manipulation of textiles remains challenging because continuous deformation and self-occlusions hinder the robust visual perception required to estimate the cloth's state. To address the lack of annotated real-world data, we developed a Blender-based synthetic pipeline exporting auto-annotated keypoints, and combined manually labeled renders with real-world data to train a wrinkle detector. We present a perception framework integrating...

arXiv CS 5d ago

ViTAMIn-O: Democratizing computer vision-based machine learning for stem cell research

Deep Learning (DL) holds exciting potential in automating the prediction of organoid differentiation results. Nevertheless, current models lack adaptability, openness, and robustness in performance. Additionally, broad employments of predictive models in wet-lab settings necessitate machine learning expertise, often not readily available in biologically oriented laboratories.

bioRxiv 7d ago

A Unified Framework for Probabilistic Dynamic-, Trajectory- and Vision-based Virtual Fixtures

arXiv:2506.10239v3 Announce Type: replace Abstract: Probabilistic Virtual Fixtures (VFs) enable the adaptive selection of the most suitable haptic feedback for each phase of a task, based on learned or perceived uncertainty. While keeping the human in the loop remains essential, for instance, to ensure high precision, partial automation of certain task phases is critical for productivity. We present a unified framework for probabilistic VFs that seamlessly switches between manual fixtures,...

arXiv CS 8d ago

Vision-Based Early Fault Diagnosis and Self-Recovery for Strawberry Harvesting Robots

Announce Type: replace Abstract: Strawberry-harvesting robots faced challenges such as poor visual perception, gripper misalignment, empty grasp/misgrasp, and slippage, which reduced harvesting stability and efficiency. To overcome these issues, this paper proposes a visual fault diagnosis and self-recovery framework. An end-to-end SRR-Net achieved unified perception and fault diagnosis through joint detection, segmentation, and ripeness regression of the fruit and gripper.

arXiv CS 1d ago

DragOn: A Benchmark and Dataset for Drag-Based GUI Interactions

Announce Type: new Abstract: GUI agents - vision-based models that control desktops, web browsers, and mobile devices through graphical user interfaces - promise to automate a wide range of digital tasks. While million-scale datasets have enabled substantial progress on click-grounding, drag grounding (e.g. drag-and-drop, swipe, highlight) data remains an order of magnitude smaller and current models fall short on complex drag-based interactions. We introduce DragOn, a drag grounding...

arXiv CS 5d ago