Home Knowledge Base Multi-Scale Feature Fusion Module

Multi-Scale Feature Fusion Module

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Step-adaptive multimodal fusion network with multi-scale cloud feature learning for ultra-short-term solar irradiance forecasting

arXiv:2606.06102v1 Announce Type: new Abstract: Ultra-short-term solar irradiance prediction is critical for photovoltaic system dispatch and power grid stability. Existing approaches suffer from three key shortcomings: single time-series models cannot capture the spatial dynamics of clouds under complex conditions, standard convolutions inadequately represent multi-scale cloud features, and fixed low-frequency compensation strategies fail to adapt to different prediction steps. To address...

arXiv CS 5d ago

Cross-Modality Feature Fusion Based on Structured State Space Duality for Multimodal Image Registration Network

arXiv:2606.03341v1 Announce Type: new Abstract: In multi-modal image registration, the primary challenge lies in shared structural information extraction. Compared to Transformers, Structured State Space Duality (SSD) offers greater global structural feature extraction with higher efficiency during training and inference. Inspired by these advantages, we propose a novel algorithm for multi-modal image registration, named RegNetMamba-2.

arXiv CS 7d ago

SemDINO: A DINOv3-Driven Network for Cross-Temporal Semantic Alignment in Change Detection

arXiv:2606.09772v1 Announce Type: new Abstract: Semantic change detection (SCD) aims to simultaneously locate land-cover changes and identify semantic categories before and after transition. However, existing methods suffer from insufficient cross-temporal alignment, weak multi-scale representation, and poor robustness to pseudo-changes caused by illumination, season, and registration noise. To address these issues, we propose a novel end-to-end semantic change detection network named...

arXiv CS 1d ago

FMRFusion: Frequency-Aware Multi-View Representation Learning for Heterogeneous Image Fusion

Announce Type: new Abstract: Infrared and visible image fusion aims to generate a composite image that retains significant target information and preserves detailed textures, integrating two heterogeneous modalities. Previous image fusion methods typically adopt a single-module stacking approach to extract features from the two modalities. However, these approaches may result in incomplete learning of their distinct characteristics, thereby limiting the fusion effectiveness and constrain ing...

arXiv CS 1d ago

Med-URWKV{\dag}: Toward Enhanced Pretrained Pure VRWKV Models for Medical Image Segmentation

arXiv:2506.10858v2 Announce Type: replace-cross Abstract: Medical image segmentation is a fundamental task in computer-aided diagnosis and treatment. Existing approaches based on CNNs, ViTs, Mamba, and hybrid models still suffer from limitations such as restricted receptive fields, high computational cost, or insufficient accuracy. Recently, Vision Receptive-field Weighted Key-Value (VRWKV) models have emerged as a promising alternative,delivering strong long-range dependency modeling for...

arXiv CS 8d ago

Edge-Aware and Content-Adaptive Infrared Gas Leak Detection for Industrial Safety Monitoring

Announce Type: replace Abstract: Infrared gas leak detection is important for industrial safety and environmental monitoring, but automatic detection remains challenging because gas plumes are often faint, small, semi-transparent, and weakly bounded. This paper proposes an Edge-Aware and Content-Adaptive Feature Fusion Detector (ECAF-Det) for weak-plume detection in cluttered thermal scenes. ECAF-Det integrates three task-oriented designs: a plume-oriented local-global feature enhancement...

arXiv CS 7d ago

LiteViLNet: Lightweight Vision-LiDAR Fusion Network for Efficient Road Segmentation

arXiv:2605.21007v2 Announce Type: replace Abstract: Road segmentation is a fundamental perception task for autonomous driving and intelligent robotic systems, requiring both high accuracy and real-time inference, especially for deployment on resource-constrained edge devices. Existing multi-modal road segmentation methods often rely on heavy transformer-based encoders to achieve state-of-the-art performance, but their enormous computational cost prohibits real-time deployment on embedded...

arXiv CS 9d ago

Attention-Guided Autoencoder Fusion for Insulator Defect Detection Using UAV Transmission-Line Imaging

arXiv:2606.06536v1 Announce Type: new Abstract: Automated defect detection in high-voltage transmission-line insulators remains challenging due to severe class imbalance, large scale variation, and the small spatial extent of defect instances in Unmanned Aerial Vehicle (UAV) imagery. To address these challenges, this paper proposes AE-YOLO, an Attention-Guided AutoEncoder-Enhanced YOLO framework for robust insulator defect detection. The architecture integrates lightweight bottleneck...

arXiv CS 2d ago

Personalized 3D Myocardial Infarct Geometry Reconstruction from Cine MRI for Cardiac Digital Twins

arXiv:2606.01808v1 Announce Type: new Abstract: Accurate 3D geometric characterization of myocardial infarction (MI) is essential for building cardiac digital twins (CDTs) to precisely simulate infarct-related electrophysiology. Late gadolinium enhancement magnetic resonance imaging (LGE MRI) is the clinical reference for locating MI, yet its reliance on contrast agents restricts use in renally impaired patients and limits longitudinal follow-ups. As an alternative, contrast-free cine MRI...

arXiv CS 8d ago

EIVE: End-to-End Instance-Specific Visual Explanations for Detection Transformers

arXiv:2606.01601v1 Announce Type: new Abstract: Visual explainability for object detection remains challenging due to the multi-instance nature of detection. Existing approaches predominantly adopt post-hoc paradigms, such as gradient-based or perturbation-based explanation methods, to interpret pretrained detectors. However, these methods require additional gradient computation or repeated model inference, resulting in limited efficiency.

arXiv CS 8d ago