a Fusion State Space Model
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
Light-WAM: Efficient World Action Models with State-Fusion Action Decoding
arXiv:2606.08242v1 Announce Type: new Abstract: World Action Models (WAMs) extend robot policy learning by incorporating future prediction as an additional training objective, encouraging the policy to encode task-relevant temporal structure in its representations. Current WAMs often rely on large-scale generative architectures that incur high training costs and inference latency, making them difficult to deploy as efficient closed-loop policies. We propose Light-WAM, a lightweight World...
Cross-Modality Feature Fusion Based on Structured State Space Duality for Multimodal Image Registration Network
arXiv:2606.03341v1 Announce Type: new Abstract: In multi-modal image registration, the primary challenge lies in shared structural information extraction. Compared to Transformers, Structured State Space Duality (SSD) offers greater global structural feature extraction with higher efficiency during training and inference. Inspired by these advantages, we propose a novel algorithm for multi-modal image registration, named RegNetMamba-2.
Forget Attention: Importance-Aware Attention Is All You Need
arXiv:2606.02332v2 Announce Type: replace Abstract: Combining attention's global retrieval with the sequential importance signal of state space models (SSMs) is the open challenge of hybrid language modeling. Transformers see everywhere but cannot prioritize; SSMs know what matters but cannot revisit.
Forget Attention: Importance-Aware Attention Is All You Need
Announce Type: new Abstract: Combining attention's global retrieval with the sequential importance signal of state space models (SSMs) is the open challenge of hybrid language modeling. Transformers see everywhere but cannot prioritize; SSMs know what matters but cannot revisit. Existing hybrids -- Jamba (block level) and Hymba (head level) -- place the two in separate compartments, so neither informs the other during the attention computation itself.
ReaLM: Residual Quantization Bridging Knowledge Graph Embeddings and Large Language Models
arXiv:2510.09711v2 Announce Type: replace Abstract: Large Language Models (LLMs) have recently emerged as a powerful paradigm for Knowledge Graph Completion (KGC), offering strong reasoning and generalization capabilities beyond traditional embedding-based approaches. However, existing LLM-based methods often struggle to fully exploit structured semantic representations, as the continuous embedding space of pretrained KG models is fundamentally misaligned with the discrete token space of...
A Unified LLM-Adaptable Framework for Cold-Start Cognitive Diagnosis
arXiv:2505.21239v2 Announce Type: replace Abstract: Cognitive Diagnosis has become a critical task in AI-empowered education, supporting personalized learning by accurately assessing students' cognitive states. However, traditional cognitive diagnosis models (CDMs) often struggle in cold-start scenarios due to the lack of student-exercise interaction data. Recent NLP-based approaches leveraging pre-trained language models (PLMs) have shown promise by utilizing textual features, but they fail...
3D Segment Anything Model with Visual Mamba for Diagnosing Placenta Accreta Spectrum
Announce Type: replace Abstract: Placenta Accreta Spectrum (PAS) is a rare but highly dangerous obstetric disease. Early and accurate PAS diagnosis is critical for maternal health. Traditional PAS diagnosis relies on experienced doctors by analyzing the cesarean history and Magnetic Resonance Imaging (MRI) data.
Vision-Language Guided Hyperspectral Object Tracking via Semantics Fusion and Contextual Template Updating
arXiv:2606.09167v1 Announce Type: new Abstract: Hyperspectral object tracking (HOT) leverages the rich spectral information provided by hyperspectral videos (HSVs), offering substantial potential for object tracking. However, efficiently extracting and exploiting spectral information from redundant spectral bands remains a fundamental challenge, which severely limits model generalization and tracking performance.
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks This post is a high-level explainer for my Master’s thesis, which involves designing hardware architectures for ultrafast inference and online learning using the Kolmogorov-Arnold Network (KAN) architecture. I’ll assume familiarity with standard machine learning concepts, as well as some understanding of hardware and digital circuits; read my previous post here for the latter. Please read the two papers below for more...
EEG-Based Multimodal Learning via Hyperbolic Mixture-of-Curvature Experts
arXiv:2604.12579v3 Announce Type: replace Abstract: Electroencephalography (EEG)-based multimodal learning integrates brain signals with complementary modalities to improve mental state assessment, providing great clinical potential. The effectiveness of such paradigms largely depends on the representation learning on heterogeneous modalities. For EEG-based paradigms, one promising approach is to leverage their hierarchical structures, as recent studies have shown that both EEG and...