Home Knowledge Base SVD LLM

SVD LLM

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

SigmaScale: LLM Compression with SVD-based Low-Rank Decomposition and Learned Scaling Matrices

Announce Type: new Abstract: We present SigmaScale, a method for learning auxiliary scaling matrices $S$ to aid truncated Singular Value Decomposition (SVD) based Large Language Model (LLM) compression. Instead of deriving scaling matrices analytically, SigmaScale optimizes two sets of vectors that define diagonal row and column scaling transformations under an activation-aware compression loss. We show that learned scaling lowers the effective intrinsic rank of weight matrices, as reflected...

arXiv CS 2d ago

Cross-Layer Subspace Coupling for LLM Compression: A Unifying Framework and Its Empirical Limits

Announce Type: new Abstract: Recent SVD based compression methods for large language models like SVD LLM and Basis Sharing can be unified under one optimization problem. While mathematical proofs and tests on Pythia models show this unified approach improves weight reconstruction error by up to 46% percent it fails in practical tasks.

arXiv CS 9d ago

Cross-Layer Subspace Coupling for LLM Compression: A Unifying Framework and Its Empirical Limits

arXiv:2605.30836v2 Announce Type: replace Abstract: Recent SVD based compression methods for large language models like SVD LLM and Basis Sharing can be unified under one optimization problem. While mathematical proofs and tests on Pythia models show this unified approach improves weight reconstruction error by up to 46% percent it fails in practical tasks.

arXiv CS 1d ago

Swift-SVD: Theoretical Optimality Meets Practical Efficiency in Low-Rank LLM Compression

Announce Type: replace Abstract: The deployment of Large Language Models is constrained by the memory and bandwidth demands of static weights and dynamic Key-Value cache. SVD-based compression provides a hardware-friendly solution to reduce these costs. However, existing methods suffer from two key limitations: some are suboptimal in reconstruction error, while others are theoretically optimal but practically inefficient.

arXiv CS 1d ago

Scaling Self-Evolving Agents via Parametric Memory

arXiv:2606.04536v1 Announce Type: new Abstract: Existing memory-augmented LLM agents store past experience exclusively in prompt space, as textual summaries or retrieved passages, while keeping model parameters frozen throughout a rollout. Such agents can \emph{look up} what they have seen but cannot \emph{learn from} it: their policy is unchanged by experience, and any information dropped from the context is permanently lost. We introduce \texttt{TMEM}, a self-evolving parametric memory...

arXiv CS 6d ago

Re-examining Low Rank adaptation for private LLM fine-tuning

arXiv:2510.01137v3 Announce Type: replace Abstract: Privacy is a central concern when fine-tuning large language models (LLMs) on sensitive data, and differentially private stochastic gradient descent (DP-SGD) -- which clips per-sample gradients and adds calibrated Gaussian noise -- is the standard tool for formal privacy guarantees. Both theory and practice show that lower-rank models are better suited to DP training, a property especially relevant for LLMs, whose fine-tuning gradients...

arXiv CS 9d ago