Masked Segment Modeling

Related Articles from SNS

MedVeriSeg: Teaching LISA-Like Medical Segmentation Models to Verify Query Validity Without Extra Training

Announce Type: replace Abstract: Despite recent progress in text-prompt-based medical image segmentation, existing LISA-like MLLM-based methods typically generate masks regardless of whether the target specified in the query is present, leading to hallucinated segmentation. In this work, we propose MedVeriSeg, a training-free query verification framework that enables LISA-like medical segmentation models to reject false segmentation queries. MedVeriSeg first quantifies the response quality...

arXiv CS 1d ago

3D Segment Anything Model with Visual Mamba for Diagnosing Placenta Accreta Spectrum

Announce Type: replace Abstract: Placenta Accreta Spectrum (PAS) is a rare but highly dangerous obstetric disease. Early and accurate PAS diagnosis is critical for maternal health. Traditional PAS diagnosis relies on experienced doctors by analyzing the cesarean history and Magnetic Resonance Imaging (MRI) data.

arXiv CS 7d ago

SAMatcher: Co-Visibility Modeling with Segment Anything for Robust Feature Matching

Announce Type: new Abstract: Reliable correspondence estimation is a fundamental problem in image processing, underpinning applications such as Structure from Motion, visual localization, and image registration. Existing learning-based methods have significantly improved local feature representations, yet most still operate at the pixel or patch level and lack explicit modeling of regions that are jointly visible across views. We propose SAMatcher, a feature matching framework that...

arXiv CS 7d ago

TopoPult-SSL: Gland-Mask-Free Cross-Device Meibomian Gland Segmentation via Self-Distilled Weak Clinical Priors

arXiv:2606.05347v1 Announce Type: new Abstract: Every new clinical imaging device creates a domain shift where dense gland masks are expensive yet cheap clinical signals -- eyelid outlines, Pult grades, morphometric ratios -- are routinely recorded. We present TopoPult-SSL, a two-stage framework for cross-device meibomian gland segmentation. Stage 1 adapts a source-trained model without target gland masks in the training loss, using four weak-prior anchors driven by target eyelid masks and...

arXiv CS 5d ago

Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

arXiv:2606.09303v1 Announce Type: new Abstract: The rapid development of pretrained foundation models has enabled more general image segmentation. Multimodal large language models (MLLMs) have been widely explored for image segmentation with complex queries that require high-level reasoning. Despite promising progress, existing methods are often constrained by limited training data and the gap between MLLMs and mask generation modules.

arXiv CS 1d ago

dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching

arXiv:2506.06295v3 Announce Type: replace Abstract: Autoregressive Models (ARMs) have long dominated the landscape of Large Language Models. Recently, a new paradigm has emerged in the form of diffusion-based Large Language Models (dLLMs), which generate text by iteratively denoising masked segments. This approach has shown significant advantages and potential.

arXiv CS 1d ago

CamoSAM2: SAM2-oriented Prompt Auto-Refinement for Video Camouflaged Object Detection

Announce Type: replace Abstract: The Segment Anything Model 2 (SAM2), a prompt-guided video foundation model, has remarkably performed in video object segmentation, drawing significant attention in the community. Due to the high similarity between camouflaged objects and their surroundings, which makes them difficult to distinguish even by the human eye, the application of SAM2 for automated segmentation in real-world scenarios faces challenges in camouflage perception and reliable prompts...

arXiv CS 1d ago

CR-Seg: Attention-Guided and CoT-Enhanced Coarse-to-Refined Reasoning Segmentation

Announce Type: replace Abstract: Reasoning segmentation aims to segment target objects described by complex language through joint visual-textual reasoning. Existing methods typically rely on either learned semantic tokens to bridge Multimodal Large Language Models (MLLMs) and segmentation models, suffering from difficult cross-modal alignment, or explicit spatial prompts such as bounding boxes, which may lose holistic response semantics. To address these limitations, we propose...

arXiv CS 6d ago

CR-Seg: Attention-Guided and CoT-Enhanced Coarse-to-Refined Reasoning Segmentation

Announce Type: new Abstract: Reasoning segmentation aims to segment target objects described by complex language through joint visual-textual reasoning. Existing methods typically rely on either learned semantic tokens to bridge Multimodal Large Language Models (MLLMs) and segmentation models, suffering from difficult cross-modal alignment, or explicit spatial prompts such as bounding boxes, which may lose holistic response semantics. To address these limitations, we propose Attention-Guided...

arXiv CS 7d ago