
TwinQuant

Related Articles from SNS

TwinQuant: Learnable Subspace Decomposition for 4-Bit LLM Quantization

arXiv:2606.01556v1. Abstract: 4-bit quantization reduces the memory footprint and latency of large language model inference, but its aggressive precision reduction can severely degrade accuracy. Prior methods address this by decomposing each weight matrix into two components (e.g., via singular value decomposition) and quantizing them separately, assigning the bulk of values to a low-precision residual component while handling outliers with a high-precision low-rank...
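
The two-component split that the abstract attributes to prior methods can be sketched roughly as below. This is only an illustration of that general idea, not TwinQuant itself (the abstract is truncated before it describes the method). The function names `quantize_4bit` and `decompose_and_quantize`, the symmetric per-row round-to-nearest scheme, the rank of 1, and the synthetic weight matrix are all assumptions made for the example.

```python
import numpy as np


def quantize_4bit(x):
    """Symmetric per-row round-to-nearest quantization to signed 4-bit codes.

    Returns (codes, scale) such that codes * scale approximates x.
    This scheme is an assumption for illustration only.
    """
    max_abs = np.max(np.abs(x), axis=1, keepdims=True)
    # Guard against all-zero rows so the division below is safe.
    scale = np.where(max_abs > 0, max_abs / 7.0, 1.0)
    codes = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return codes, scale


def decompose_and_quantize(w, rank):
    """Split w into a full-precision low-rank part and a 4-bit residual."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]  # kept in high precision
    residual = w - low_rank                          # bulk of the values
    codes, scale = quantize_4bit(residual)
    return low_rank, codes, scale


# Synthetic weights: Gaussian noise plus a strong rank-1 "outlier" direction.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w += 10.0 * np.outer(rng.normal(size=64), rng.normal(size=64))

# Baseline: quantize the whole matrix directly.
direct_codes, direct_scale = quantize_4bit(w)
err_direct = np.linalg.norm(w - direct_codes * direct_scale)

# Decomposed: high-precision rank-1 part plus 4-bit residual.
low_rank, codes, scale = decompose_and_quantize(w, rank=1)
w_hat = low_rank + codes * scale
err_split = np.linalg.norm(w - w_hat)

print("direct 4-bit error:", err_direct)
print("low-rank + 4-bit residual error:", err_split)
```

Removing the dominant low-rank direction first shrinks the per-row dynamic range of the residual, so the 16 available levels are spent on a narrower interval. In this synthetic setup that is why the split reconstruction is expected to be closer to the original matrix than direct quantization.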

arXiv CS 8d ago