Home › Knowledge Base › Semantic Object Correspondence

Semantic Object Correspondence

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

SOCO: Benchmarking Semantic Object Correspondence in Vision Foundation Models

arXiv:2605.31597v1 Announce Type: new Abstract: Measuring structured object understanding in vision foundation models remains challenging due to inconsistent evaluation protocols and limited part-level supervision. Semantic correspondence (SC) evaluates this capability by testing whether object parts can be matched across instances and categories under large variations in appearance, viewpoint, and geometry. To enable a systematic SC evaluation, we introduce SOCO, a new benchmark for...

arXiv CS 9d ago

SOCO: Benchmarking Semantic Object Correspondence in Vision Foundation Models

arXiv:2605.31597v2 Announce Type: replace Abstract: Measuring structured object understanding in vision foundation models remains challenging due to inconsistent evaluation protocols and limited part-level supervision. Semantic correspondence (SC) evaluates this capability by testing whether object parts can be matched across instances and categories under large variations in appearance, viewpoint, and geometry. To enable a systematic SC evaluation, we introduce SOCO, a new benchmark for...

arXiv CS 8d ago

PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps

arXiv:2606.01788v1 Announce Type: new Abstract: Embodied visual navigation, where an agent perceives a complex environment and acts to reach a goal from raw sensory input, underpins a wide range of applications such as household service robotics, assistive robotics, and large-scale autonomous exploration. However, recent attempts to unify vision-and-language navigation (VLN) and object goal navigation (ObjNav) remain at the level of architectural fusion, mixed-task training, and large...

arXiv CS 8d ago

Disambiguation of two-tone images reveals semantic contributions to object recognition in the EEG

Electrophysiological responses to visual objects carry information about stimulus identity and semantic category, but it remains difficult to know whether such information represents semantic knowledge or merely regularities in physical image features. Here, we presented two-tone images while recording EEG to dissociate the learned semantic concept from physical stimulus properties in the electrophysiological signal. Seventeen healthy participants completed a semantic disambiguation...

bioRxiv 5d ago

AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Afford Correspondence

Announce Type: replace Abstract: Despite the recent success of modern imitation learning methods in robot manipulation, their performance is often constrained by geometric variations due to limited data diversity. Leveraging powerful 3D generative models and vision foundation models (VFMs), the proposed AffordGen framework overcomes this limitation by utilizing the semantic correspondence of meaningful keypoints across large-scale 3D meshes to generate new robot manipulation trajectories....

arXiv CS 8d ago

Semantic-weighted ICP for LiDAR Odometry: Class-Aware Residual Reweighting for Robust Scan Registration

Announce Type: new Abstract: LiDAR odometry is a fundamental component of autonomous robotic systems, relying on geometric registration between consecutive point clouds to estimate ego-motion. However, traditional geometric approaches often degrade in dynamic or unstructured environments due to unreliable correspondences caused by moving objects, sparse geometric features, vegetation, and semantically ambiguous structures. Existing works have shown that, some of these limitations can be...

arXiv CS 7d ago

ObjEmbed: Towards Universal Multimodal Object Embeddings

arXiv:2602.01753v3 Announce Type: replace Abstract: Aligning objects with corresponding textual descriptions is a fundamental challenge and a realistic requirement in vision-language understanding. While recent multimodal embedding models excel at global image-text alignment, they often struggle with fine-grained alignment between image regions and specific phrases. In this work, we present ObjEmbed, a novel MLLM embedding model that decomposes the input image into multiple regional...

arXiv CS 8d ago

Internalizing Temporal Consistency in Video Object-Centric Learning without Explicit Regularization

arXiv:2605.31508v1 Announce Type: new Abstract: Video Object-Centric Learning (OCL) aims to represent objects as \textit{slot} vectors and maintain their consistency across frames. Slot-Slot Contrastive (SSC) loss has become the cornerstone for state-of-the-art (SOTA) video OCL methods. While highly effective, SSC relies on one-to-one object correspondence across frames and introduces an extra loss.

arXiv CS 9d ago

Robot-DIFT: Correspondence-Sensitive Diffusion Features for Contact-Rich Robot Manipulation

arXiv:2602.11934v2 Announce Type: replace Abstract: Robot manipulation often fails in the final millimeters: a policy may recognize the right object yet miss the pose offsets, boundaries, or pre-contact alignments needed for action. We argue that such failures arise when semantic invariance suppresses correspondence cues for closed-loop control, or when these cues are not exposed to the policy in a usable form. Modern visual encoders provide strong semantic abstractions, but contact-rich...

arXiv CS 1d ago

ScriptHOI: Learning Scripted State Transitions for Open-Vocabulary Human-Object Interaction Detection

Announce Type: replace Abstract: Open-vocabulary human-object interaction (HOI) detection requires recognizing interaction phrases that may not appear as annotated categories during training. Recent vision-language HOI detectors improve semantic transfer by matching human-object features with text embeddings, but their predictions are often dominated by object affordance and phrase-level co-occurrence. As a result, a model may predict \textit{cut cake} from the presence of a knife and a cake...

arXiv CS 8d ago