Contrastive Augmented Transformer

Related Articles from SNS

Contrastive Augmented Transformer with Domain-specific Enhancement for Robust Multi-scenario Metal Surface Defect Detection

arXiv:2606.01962v1 Announce Type: new Abstract: Metal surface defect detection is critical for maintaining product quality in industrial manufacturing. However, it faces significant challenges, including limited annotated data, difficulty in identifying subtle multi-scale defects, and poor generalization across diverse scenarios. To address these issues, this paper proposes a novel Contrastive Augmented Transformer (CAT) framework for robust defect detection.

arXiv CS 8d ago


WAV: Multi-Resolution Block Residual Routing for Deep Decoder-Only Transformers

Announce Type: new Abstract: Residual connections are central to training deep Transformers, but standard PreNorm residual streams aggregate sublayer updates with fixed unit weights. Recent Attention Residuals replace this fixed accumulation with content-dependent depth-wise routing, and Block Attention Residuals make the mechanism efficient by routing over block-level residual summaries. However, a single block summary stores only the low-frequency total residual displacement inside a...

arXiv CS 2d ago

Domain Adaptation with a Single Vision-Language Embedding

Announce Type: replace Abstract: Domain adaptation has been extensively investigated in computer vision but still requires access to target data at training time, which can be difficult to obtain in real-world autonomous driving scenarios, especially under rare or adverse conditions. In this paper, we present a new framework for domain adaptation relying on a single Vision-Language (VL) latent embedding instead of full target data. First, leveraging a contrastive language-image...

arXiv CS 8d ago

BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding

Announce Type: new Abstract: Learning representations of CAD models is a largely open problem. While 3D representation learning has flourished around point clouds and meshes, the native format of CAD, boundary representations (BReps), which encode exact parametric surfaces, curves, and their topology, has received little attention as a representation learning substrate.

arXiv CS 5d ago

Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining

arXiv:2509.22468v2 Announce Type: replace Abstract: High-quality molecular representations are essential for property prediction and molecular design, yet large labeled datasets remain scarce. While self-supervised pretraining on molecular graphs has shown promise, many existing approaches either depend on hand-crafted augmentations or complex generative objectives, and often rely solely on 2D topology, leaving valuable 3D structural information underutilized. To address this gap, we...

arXiv CS 7d ago

MedSyn2: Flexible Control of 3D CT Generation via Text and Semantically-Defined Segmentation Prompts

arXiv:2606.00967v3 Announce Type: replace Abstract: Generative models for volumetric medical images have found many applications in medical imaging, ranging from data augmentation to serving as priors for inverse problems. For these applications, generating high-resolution 3D images with strong controllability is essential but remains highly challenging. Existing approaches typically control generation either through radiology reports used as text prompts or through full image segmentation.

arXiv CS 1d ago
