Generalized Pose
Related Articles from SNS
Virtual-point-based Solutions to Handle Generalized Absolute Pose Problem
arXiv:2606.09294v1 Announce Type: new Abstract: Multi-camera systems are increasingly adopted in robotics and autonomous navigation for their wide field of view, flexibility, and fault tolerance. Nevertheless, existing PnP solvers fail to handle multiple projection centers. This paper introduces a virtual point formulation that bridges the standard PnP and generalized pose problems, enabling a unified pipeline that transforms existing PnP solvers into generalized pose solvers.
$R^3$: 3D Reconstruction via Relative Regression
Announce Type: replace Abstract: Recent feed-forward geometry foundation models have demonstrated impressive generalization by recovering depth and poses in a single forward pass. However, these models are typically constrained by a global coordinate frame assumption. This dependency becomes a significant bottleneck for long-context and streaming reconstruction, as it forces the network to maintain an arbitrary temporal origin and handle translation magnitudes that grow unbounded over time.
Zero-Shot Polygon Matching with Pre-trained Models for Pose Estimation and Polygon Cloud from Challenging Stereo
Announce Type: replace Abstract: While stereo matching has achieved maturity for 0D point and 1D line primitives, establishing correspondences for 2D polygons remains largely unexplored due to challenges including disparity discontinuity, scale variation, training dependency, and poor generalization, limiting downstream tasks such as pose estimation and 3D reconstruction. To address these issues, we are the first to propose a Zero-shot Polygon Matching paradigm with Pre-trained Models (i.e.,...
DisPOSE: Projected Polystochastic Diffusion for Self-Supervised Multi-View 3D Human Pose Estimation
arXiv:2606.07419v2 Announce Type: replace Abstract: Recovering 3D human poses for multiple individuals from different camera views is a fundamental bottleneck for analyzing interacting behaviors. Existing self-supervised approaches leverage synthetic catalogues of 3D poses; however, this leads to poor generalization in real-world scenarios due to distribution shifts.
DisPOSE: Projected Polystochastic Diffusion for Self-Supervised Multi-View 3D Human Pose Estimation
arXiv:2606.07419v1 Announce Type: new Abstract: Recovering 3D human poses for multiple individuals from different camera views is a fundamental bottleneck for analyzing interacting behaviors. Existing self-supervised approaches leverage synthetic catalogues of 3D poses; however, this leads to poor generalization in real-world scenarios due to distribution shifts. We therefore introduce DisPOSE, a self-supervised framework that approximates the inherently discrete multi-view person-assignment...
'Your AI Text is not Mine': Redefining and Evaluating AI-generated Text Detection under Realistic Assumptions
arXiv:2606.04906v1 Announce Type: new Abstract: Although it is generally agreed that AI-generated text poses a broad societal risk, there is no common understanding in the AI-generated text detection literature on what constitutes harmful use. Rather, existing datasets and approaches often define their own criteria and make their own assumptions, sometimes implicitly, and often only loosely related to real-world needs and applications.
Adversarial Attacks Already Tell the Answer: Directional Bias-Guided Test-time Defense for Vision-Language Models
Announce Type: new Abstract: Vision-Language Models (VLMs), such as CLIP, have shown strong zero-shot generalization but remain highly vulnerable to adversarial perturbations, posing serious risks in real-world applications. Test-time defenses for VLMs have recently emerged as a promising and efficient approach to defend against adversarial attacks without requiring costly large-scale retraining. In this work, we uncover a surprising phenomenon: under diverse input transformations,...
BEAST3D: Animal behavioral analysis and neural encoding from multi-view video via Gaussian splatting
arXiv:2606.02937v1 Announce Type: cross Abstract: Multi-view video recordings are increasingly used to capture the 3D movements of animals in experimental settings, yet extracting rich 3D representations from these recordings remains challenging. Supervised pose estimation requires extensive manual annotation, while general-purpose 3D reconstruction models trained on generic scene datasets fail on the specialized imagery and sparse-view setting of laboratory experiments. We address these...
iMaC: Translating Actions into Motion and Contact Images for Embodied World Models
arXiv:2606.09813v1 Announce Type: new Abstract: Embodied world models have emerged as a pivotal paradigm for visual robotic decision-making and interactive environment simulation. However, conventional embodied frameworks rely on low-dimensional structured action vectors (e.g., joint angles and end-effector poses), which suffer from limited expressive capacity, poor generalization across diverse embodiments, and unnatural dynamic modeling for complex physical interactions. To address these...
As a physician, I know we need to focus on helping the forgotten smoker
As a physician, former member of Congress, and former secretary of the U.S. Department of Health and Human Services, I have spent much of my career focused on policies that improve health outcomes. I have also seen the toll of smoking up close. I lost my father to what I often call "Lucky Strike lungs."