Active Perception

Related Articles from SNS

ACTIVE-o3: Empowering MLLMs with Active Perception via Pure Reinforcement Learning

arXiv:2505.21457v2 Abstract: Active vision, also known as active perception, refers to actively selecting where and how to look in order to gather task-relevant information. It is a critical component of efficient perception and decision-making in humans and advanced embodied agents. With the rise of Multimodal Large Language Models (MLLMs) as central planners in robotic systems, the lack of methods for equipping MLLMs with active perception has become a key gap.

arXiv CS 1d ago

ActiveMimic: Egocentric Video Pretraining with Active Perception

Abstract: Egocentric human video offers a scalable alternative to robot data for pretraining, yet models pretrained on such video consistently underperform those pretrained on robot data. We attribute this gap to a missing signal, the active perception behavior in egocentric videos, where humans continuously reposition their viewpoint during manipulation, inducing camera motion that standard pipelines treat as noise. To address this, we present ActiveMimic, a pretraining...

arXiv CS 5d ago

Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding

Abstract: Long video understanding (LVU) is challenging because answering real-world queries often depends on sparse, temporally dispersed cues buried in hours of mostly redundant and irrelevant content. While agentic pipelines improve video reasoning capabilities, prevailing frameworks rely on a query-agnostic captioner to perceive video information, which wastes computation on irrelevant content and blurs fine-grained temporal and spatial information. Motivated by...

arXiv CS 5d ago

Mesoscopic cortical activities associated with pupil-linked perceptions inferred via explainable machine learning

Pupil dilation reflects arousal-related neural processes and is closely linked to sensory perception, attention, and cognitive state, but the mesoscopic cortical dynamics that accompany stimulus-evoked dilation remain unclear. Here, we combined simultaneous pupillometry and wide-field Ca²⁺ imaging in mice with explainable machine learning to identify cortical activity patterns predictive of pupil dilation. Cortical activity was recorded during hindpaw somatosensory stimulation, visual pattern...

bioRxiv 9d ago

Vision-Language Work Zone Intelligence for Safety-Critical Speed Regulation of Mixed-Autonomy Vehicles in Dynamic Environments

arXiv:2606.08860v1 Abstract: Temporary work-zone speed limits are communicated through visually inconsistent signage and are often missing from digital maps, creating safety risks for human drivers and automated vehicle systems. We present a real-time, onboard perception pipeline that detects active work zones, recognizes associated temporary speed limits, and outputs a law-aware work-zone state and speed value suitable for driver alerts or downstream automated control.

arXiv CS 1d ago

Extracellular NAD(P) activates systemic acquired resistance through LecRK-VI.2-mediated phosphorylation of NPR1

Systemic acquired resistance (SAR) is a long-lasting, broad-spectrum immune response induced in distal tissues by signals generated at primary infection sites. Although numerous mobile immune signals have been implicated in SAR, how these signals are perceived and mechanistically coupled to transcriptional reprogramming in systemic tissues remains poorly understood. Extracellular NAD(P) functions as a key integrative SAR signal that activates immunity through the plasma membrane-localized lectin receptor kinase...

bioRxiv 4d ago

Cooperative Circumnavigation for Multiple Unmanned Surface Vehicles Without External Localization

arXiv:2606.04518v1 Abstract: This paper proposes a cooperative target circumnavigation framework for multiple unmanned surface vehicles (USVs) operating without external localization. The objective is to maintain a uniform circular formation of a specified radius around a target using only limited onboard sensing. The framework adopts a heterogeneous perception strategy that distinguishes between the asymmetric sensing relationships with the target and among the USVs.

arXiv CS 6d ago

Information-dependent eye-hand coordination emerges from active vision

In daily activities, humans rely on visual information to plan hand movements, making the extraction of task-relevant information through eye gaze a key aspect of motor control. Behavioral studies have revealed characteristic saccade-pursuit patterns, likely governed by shared neural circuits, which enable an efficient reduction of task-related uncertainty. However, a unifying computational principle explaining the emergence of these patterns in continuous tasks such as reading or driving is...

bioRxiv 8d ago

PerchRL: Vision-Based Agile Perching on Inclined Platforms under Rapid and Irregular Motion

arXiv:2606.03441v1 Abstract: Autonomous vision-based perching of quadrotors on moving inclined platforms is critical for air-ground collaboration but remains challenging due to the limited field of view (FOV). In this paper, we propose PerchRL, a reinforcement learning (RL) framework for vision-based agile perching on inclined platforms under rapid and irregular motion. Specifically, we employ a two-stage learning strategy consisting of state-based pre-training followed by...

arXiv CS 7d ago

PerchRL: Vision-Based Agile Perching on Inclined Platforms under Rapid and Irregular Motion

arXiv:2606.03441v2 Abstract: Autonomous vision-based perching of quadrotors on moving inclined platforms is critical for air-ground collaboration but remains challenging due to the limited field of view (FOV). In this paper, we propose PerchRL, a reinforcement learning (RL) framework for vision-based agile perching on inclined platforms under rapid and irregular motion. Specifically, we employ a two-stage learning strategy consisting of state-based pre-training...

arXiv CS 6d ago