WINDQuant

Related Articles from SNS

WINDQuant: Weight-Informed Neural Decision-Making for Global Mixed-Precision LLM Quantization

arXiv:2605.26660v2 (announce type: replace)

Abstract: Quantization is an effective way to reduce the memory footprint and inference cost of large language models (LLMs), yet maintaining performance in the ultra-low-bit regime remains challenging. Existing post-training methods often suffer severe accuracy degradation, while quantization-aware training requires costly retraining and additional resources.

arXiv CS 8d ago