Home › Knowledge Base › Log-Mel

Log-Mel

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Diffusion-Based Heart Sound Generation: Evaluation with Physiological Signal Metrics, Classifiers, and Expert Listening

arXiv:2606.02448v1 Announce Type: cross Abstract: Publicly available phonocardiogram (PCG) datasets remain limited in size and pathological diversity, constraining both auscultation training and the generalisation of automated heart-sound classifiers. A class-conditional diffusion model for PCG generation is developed in the log-mel domain and synthetic fidelity is assessed using complementary (i) physiology-inspired plausibility metrics, (ii) downstream label-consistency evaluation, and...

arXiv CS 8d ago

Representation Matters in Randomized Smoothing for Audio Classification

arXiv:2606.04210v1 Announce Type: cross Abstract: Randomized smoothing (RS) certifies robustness in the vector space where Gaussian noise is added. In audio classification, this space is often not uniquely defined as standard pipelines normalize, range-control, and transform waveforms into log-mel or other spectral features. We show that direct RS is therefore under-specified unless the certified object and preprocessing policy are explicit.

arXiv CS 6d ago

Multi-View Speech Representation Learning for Parkinson's Disease Detection Using Context-guided Cross-modal Attention

arXiv:2606.09271v1 Announce Type: new Abstract: Parkinson's disease (PD) is a progressive neurodegenerative disorder that frequently causes speech impairments associated with hypokinetic dysarthria. As speech production relies on the precise coordination of complex neuromuscular mechanisms, speech analysis has emerged as a promising non-invasive and cost-effective biomarker for early PD detection. Recent deep learning approaches have shown encouraging results; however, most existing methods...

arXiv CS 1d ago

A Training-Efficient Transformer-Based Anti-Spoofing Network for Logical Access in ASVspoof 5

arXiv:2606.02980v1 Announce Type: new Abstract: Synthetic and manipulated speech can reduce the reliability of automatic speaker verification systems, so anti-spoofing methods need to be both accurate and efficient in training and inference. This paper focuses on the ASVspoof 5 Track 1 closed condition, where standard cross-entropy training may not give enough attention to hard trials and is not directly aligned with ranking- and threshold-based evaluation metrics. We propose TFPARN, a...

arXiv CS 7d ago