Segment Alignment
Related Articles from SNS
Segment, Embed, and Align: A Universal Recipe for Aligning Subtitles to Signing
arXiv:2512.08094v2 Announce Type: replace Abstract: The goal of this work is to develop a universal approach for aligning subtitles (i.e., spoken language text with corresponding timestamps) to continuous sign language videos. Prior approaches typically rely on end-to-end training tied to a specific language or dataset, which limits their generality. In contrast, our method Segment, Embed, and Align (SEA) provides a single framework that works across multiple languages and domains.
Rethinking Efficient Crack Segmentation with Task-Aligned Structural-Directional Modeling
arXiv:2605.31048v1 Announce Type: new Abstract: Recent crack segmentation methods often follow generic semantic segmentation designs, using stronger backbones, hybrid CNN-Transformer-Mamba encoders, and auxiliary enhancement branches. Although effective, this raises the question of whether stronger generic feature mixing is the most suitable direction for crack segmentation. We instead formulate crack segmentation as sparse structural recovery.
Segment-driven Structural Induction and Semantic Alignment for Heterogeneous Tabular Representation
arXiv:2606.01890v1 Announce Type: new Abstract: Real-world domains often contain heterogeneous tables whose headers vary while their underlying attribute semantics are shared, making it difficult to induce domain-specialized semantics from table-local evidence alone. Existing encoders model parts of this problem, but often underuse column-level value distributions and apply uniform objectives across attributes with different semantic roles. We propose NAVI, a segment-centric pretraining...
Beyond Gaussian Statistics in Polymer Melts: Statistical Masking of Persistent Local Constraints
arXiv:2605.25989v2 Announce Type: replace-cross Abstract: Short polymer chains exhibit clear deviations from Gaussian end-to-end distance statistics, yet the molecular mechanism by which Gaussian behavior is recovered in long chains remains unestablished. Atomistic molecular dynamics simulations of polyethylene melts reveal that conformational heterogeneity persists at the Kuhn scale across all chain lengths, consisting of a mosaic of slow-relaxing, extended aligned chain segments (ACS) and...
Correlative SHG-AFM imaging workflow for label-free quantitative analysis of collagen structure-function relationships
We present a user-friendly correlative second harmonic generation (SHG) and atomic force microscopy (AFM) imaging workflow for quantifying the nanomechanical properties of collagen in unfixed, unlabeled tissue sections. SHG Aligned Profiling for Elasticity and Segmentation (SHAPES) uses SHG imaging to guide AFM force mapping, enabling label-free, anatomically specific selection of regions of interest and facilitating spatially resolved characterization of fibrous collagen morphology and...
PairWise Image Finder: An Open-source Tool for Finding Visually Aligned Street-Level Image Pairs for Urban Perception Studies
arXiv:2606.08795v1 Announce Type: new Abstract: Change detection and scene recognition techniques have been widely applied to Street View Imagery (SVI) to understand changes in scenes across the years. However, metadata alone is often insufficient to reliably find visually aligned image pairs. This study introduces the PairWise image finder, a tool that integrates feature detection and matching, supported by semantic segmentation masks to quantify the visual alignment of two images of...
CR-Seg: Attention-Guided and CoT-Enhanced Coarse-to-Refined Reasoning Segmentation
Announce Type: replace Abstract: Reasoning segmentation aims to segment target objects described by complex language through joint visual-textual reasoning. Existing methods typically rely on either learned semantic tokens to bridge Multimodal Large Language Models (MLLMs) and segmentation models, suffering from difficult cross-modal alignment, or explicit spatial prompts such as bounding boxes, which may lose holistic response semantics. To address these limitations, we propose...
CR-Seg: Attention-Guided and CoT-Enhanced Coarse-to-Refined Reasoning Segmentation
Announce Type: new Abstract: Reasoning segmentation aims to segment target objects described by complex language through joint visual-textual reasoning. Existing methods typically rely on either learned semantic tokens to bridge Multimodal Large Language Models (MLLMs) and segmentation models, suffering from difficult cross-modal alignment, or explicit spatial prompts such as bounding boxes, which may lose holistic response semantics. To address these limitations, we propose Attention-Guided...
SegTune: Structured and Fine-Grained Control for Song Generation
arXiv:2606.02638v1 Announce Type: new Abstract: Recent advances in neural song generation have enabled high-quality synthesis from lyrics and global textual prompts. However, most systems fail to model temporally varying attributes of songs, severely limiting fine-grained control over musical structure and dynamics. To address this, we propose SegTune, a Diffusion Transformer-based framework enabling structured and fine-grained controllability by allowing users or large language models...
Massively Multilingual Joint Segmentation and Glossing
Announce Type: replace Abstract: Automated interlinear gloss prediction with neural networks is a promising approach to accelerate language documentation efforts. However, while state-of-the-art models like GlossLM achieve high scores on glossing benchmarks, user studies with linguists have found critical barriers to the usefulness of such models in real-world scenarios. In particular, existing models typically generate morpheme-level glosses but assign them to whole words without predicting...