On Low-Bit Quantization Errors in Speaker Verification: Diagnostic and Mitigation

arXiv:2606.08078v1

Abstract: Although low-bit quantization provides a practical means to deploy speaker verification on resource-constrained devices, its effects on speaker verification performance remain poorly understood. In this paper, we study uniform K-means quantization-aware training of ResNet-36 and ResNet-200 through joint layer-wise and score-level analyses. Our layer-wise analysis highlights fragile components and shows that score degradation is not fully explained by weight distortion alone. We identify a clear knee point at 2 bits, with larger score drift and harmful decision flips concentrated near the FP32 threshold. Our score-level analysis reveals where and how score errors emerge under extreme quantization. Building on these findings, we propose a calibrated multi-precision cascade that resolves most trials at 2 bits and escalates only ambiguous cases, achieving performance close to FP32 while preserving the efficiency benefits of low-bit inference with substantially lower compute and memory costs.
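The cascade idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cascade_decide`, the `margin` parameter, and the toy score tables are all hypothetical, and the sketch assumes that every stage's scores have already been calibrated to a shared scale so that one decision threshold applies throughout. A trial is decided at the cheapest stage unless its score falls within `margin` of the threshold, in which case it is escalated to the next, higher-precision scorer.

```python
from typing import Callable, Hashable, Sequence, Tuple

Scorer = Callable[[Hashable], float]


def cascade_decide(
    trial: Hashable,
    scorers: Sequence[Scorer],
    threshold: float,
    margin: float,
) -> Tuple[bool, int]:
    """Decide a verification trial with a multi-precision cascade.

    `scorers` is ordered from cheapest (e.g. 2-bit) to most precise
    (e.g. FP32). Returns (accept, index of the stage that decided).
    Assumption: all scorers emit scores on the same calibrated scale.
    """
    if not scorers:
        raise ValueError("at least one scorer is required")
    last = len(scorers) - 1
    for stage, scorer in enumerate(scorers):
        score = scorer(trial)
        # Scores far from the threshold are treated as unambiguous;
        # the final stage always decides, whatever the score.
        if stage == last or abs(score - threshold) >= margin:
            return score >= threshold, stage
    raise AssertionError("unreachable: the last stage always returns")


# Toy stand-ins for real models: precomputed scores per trial id.
low_bit_scores = {"a": 0.90, "b": 0.52, "c": 0.10}
full_scores = {"a": 0.85, "b": 0.45, "c": 0.15}

results = {
    trial: cascade_decide(
        trial,
        scorers=[low_bit_scores.__getitem__, full_scores.__getitem__],
        threshold=0.5,
        margin=0.1,
    )
    for trial in ("a", "b", "c")
}
```

In this toy setup, trials "a" and "c" are settled by the low-bit scorer, while "b" sits close to the threshold and is escalated. How wide the ambiguity band should be is a tuning choice that trades escalation rate against the risk of decision flips near the threshold.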
Originally published by arXiv CS