Home › Knowledge Base › Neural Network Compression

Neural Network Compression

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Neural Network Compression by Approximate Differential Equivalence

Announce Type: new Abstract: Neural network compression is commonly achieved by pruning parameters based on local importance scores, e.g., magnitude-based pruning. We propose a complementary approach that compresses models by aggregating neurons with similar functional behavior rather than removing weights independently. Our method encodes a trained network as a polynomial ODE system and applies a lumping method called Approximate Forward Differential Equivalence to identify neurons with...

arXiv CS 8d ago

DBHN-Net: Dual-Branch Hybrid Neural Network For Low-Complexity Monaural Speech Enhancement

arXiv:2606.05911v1 Announce Type: new Abstract: Although artificial neural network (ANN) based speech enhancement (SE) methods demonstrate excellent performance, the high computational complexity and high energy consumption hinder their deployment in practical front-end processing tasks.} Currently, the spiking neural networks (SNNs) have shown potential in reducing power consumption. However, the discrete binary activation and complex spatio-temporal dynamics of SNNs often result in...

arXiv CS 5d ago

Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

Announce Type: replace Abstract: Neural scaling laws underlie many of the recent advances in deep learning, yet their theoretical understanding remains largely confined to linear models. In this work, we present a systematic analysis of scaling laws for quadratic and diagonal neural networks in the feature learning regime. Leveraging connections with matrix compressed sensing and LASSO, we derive a detailed phase diagram for the scaling exponents of the excess risk as a function of sample...

arXiv CS 5d ago

Neural Networks Provably Learn Spectral Representations for Group Composition

arXiv:2606.02993v1 Announce Type: new Abstract: Understanding how structured internal structure emerges during neural network training is central to the study of deep learning. We investigate this phenomenon through the group composition task, where a two-layer neural network is trained to predict $g_1 \star g_2$ for elements of a finite group $G$. By lifting the projected gradient flow to the Fourier domain, we demonstrate that the training dynamics are governed by a Riemannian gradient...

arXiv CS 7d ago

EinSort: Sorting is All We Need for Tensorizing LLM

Announce Type: new Abstract: Tensor networks provide efficient representations for compressing large neural networks. By carefully designing shapes and topologies, they can significantly reduce memory and computational costs. However, identifying implicit low-rank structures in large foundation models remains challenging due to their enormous scale and un-structured weight distributions.

arXiv CS 1d ago

Efficient and accurate neural-field reconstruction using resistive memory

Abstract Applications such as medical imaging, augmented and virtual reality, and embodied artificial intelligence (AI) depend on the ability to reconstruct complex signals from sparse observations. These applications are characterized by incomplete measurements and limited computational resources. Traditional approaches to digital hardware face the following challenges: explicit signal representations require heavy sampling and storage, data movement across the von Neumann bottleneck...

Nature 1d ago

WildCat: Near-Linear Attention in Theory and Practice

arXiv:2602.10056v2 Announce Type: replace Abstract: We introduce WildCat, a high-accuracy, low-cost approach to compressing the attention mechanism in neural networks. While attention is a staple of modern network architectures, it is also notoriously expensive to deploy due to resource requirements that scale quadratically with the input sequence length $n$. WildCat avoids these quadratic costs by only attending over a small weighted coreset. Crucially, we select the coreset using a fast...

arXiv CS 8d ago

GPTQ-intrinsic LoRA: A Near-optimal Algorithm for Low-precision Quantization with Low-rank Adaptation

arXiv:2606.01412v1 Announce Type: new Abstract: Post-training quantization is widely used for compressing large neural networks, but aggressive low-bit quantization can significantly degrade model quality. A common remedy is to augment the quantized weights with a low-rank correction, leading to approximations of the form $W\approx Q+LR$. In this paper, we study this low-precision plus low-rank representation through the layer-wise reconstruction objective $\|XW-X(Q+LR)\|_F^2$, where $X$ is...

arXiv CS 8d ago

CoSeP: Complementary Separability Pruning via Class-Separability Clustering

arXiv:2505.13225v2 Announce Type: replace Abstract: Neural network pruning aims to compress models for efficient deployment, yet two fundamental challenges remain. First, many methods rely on per-component importance scores, selecting filters or neurons independently and ignoring redundancy: the retained set may include multiple components capturing similar discriminative patterns while missing others entirely.

arXiv CS 1d ago

Hyper-Dimensional Fingerprints as Molecular Representations

arXiv:2604.27810v2 Announce Type: replace Abstract: Computational molecular representations underpin virtual screening, property prediction, and materials discovery. Conventional fingerprints are efficient and deterministic but lose structural information through hash-based compression, particularly at low dimensionalities. Learned representations from graph neural networks recover this expressiveness but require task-specific training and substantial computational resources.

arXiv CS 1d ago