Assessing the Energy and Carbon Emissions of Neural Speaker Verification Model in Training and Inference

Key Points

arXiv:2606.08087v1 (announce type: new)

Abstract: Deep-learning speaker verification (SV) increasingly relies on deep neural network backbones, whose environmental impact remains largely undocumented. In this paper, we evaluate ResNet architectures trained on VoxCeleb2, varying depth, channel width, and stage distribution, and measure energy consumption and carbon footprint using node-level sensors. Results show a clear point of diminishing returns: deeper or wider models bring only marginal accuracy gains while energy consumption grows steeply. In contrast, mid-sized networks such as ResNet-50 and stage-concentrated variants achieve favorable trade-offs between performance and environmental impact. These findings provide actionable guidelines for designing energy-efficient SV systems.
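
The abstract does not describe how sensor readings are turned into energy and carbon figures. As a rough illustration only, the sketch below shows one common way such a conversion can be done: integrate sampled power over time to get energy, then multiply by a grid carbon intensity. The function names, the trapezoidal integration, and the intensity value in the usage lines are all assumptions made for this example, not details taken from the paper.

```python
def energy_kwh(timestamps_s, power_w):
    """Integrate power samples (watts) over time (seconds) with the
    trapezoidal rule and return the energy in kWh.
    Hypothetical helper; the paper's actual accounting is not specified."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps_s and power_w must have the same length")
    joules = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Average power over the interval times its duration gives joules.
        joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


def carbon_g_co2e(energy_in_kwh, intensity_g_per_kwh):
    """Convert energy to grams of CO2-equivalent for a given grid
    carbon intensity (gCO2e per kWh). The intensity is an input the
    caller must supply; no value is implied by the source."""
    return energy_in_kwh * intensity_g_per_kwh


# Usage: a node drawing a constant 1000 W for one hour.
timestamps = [0.0, 1800.0, 3600.0]
power = [1000.0, 1000.0, 1000.0]
energy = energy_kwh(timestamps, power)
# 50.0 gCO2e/kWh is an arbitrary placeholder intensity for the example.
emissions = carbon_g_co2e(energy, 50.0)
```

With real measurements, the same two steps would be applied per training or inference run, which makes runs of different model sizes directly comparable on energy and emissions.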
Originally published by arXiv CS.