Deepfake Speech

Related Articles from SNS

Mitigating Proxy-to-Wild Domain Gap in Deepfake Speech

arXiv:2606.07494v1 Announce Type: new Abstract: Recent neural audio codec-based speech generation (CodecFake) produces highly realistic audio, posing a challenge to existing deepfake countermeasure models. While using codec resynthesized speech (CoRS) as proxy data improves performance, it often suffers from limited generalization.

arXiv CS 2d ago

ExpSpeech-Net: Multimodal Fusion of Expression and Speech for Deepfake Detection

Announce Type: new Abstract: Deepfake videos are increasingly challenging the credibility of online content. Many existing detection methodologies rely on complex, resource-intensive models, which limit their practical use. The study introduces the ExpSpeech-Net deepfake detection (SqN-R-DFD) model, which utilizes SqueezeNet and RNN (Recurrent Neural Network) as its backbone, providing a lightweight and efficient deepfake detection framework that simultaneously analyzes facial expressions...

arXiv CS 5d ago

AUDDT: A Unified Benchmark Toolkit for Audio and Speech Deepfake Detectors

arXiv:2509.21597v2 Announce Type: replace-cross Abstract: With the prevalence of artificial intelligence (AI)-generated content, such as audio deepfakes, a large body of recent work has focused on developing deepfake detection techniques. However, existing benchmarks employ a narrow set of datasets, leaving detector generalization to real-world conditions uncertain. In this paper, we systematically review 31 existing audio deepfake datasets and present an open-source benchmarking toolkit...

arXiv CS 6d ago

CodecFake+: Codec-Based Resynthesized Data as a Proxy for Detecting CodecFake Speech

Announce Type: replace Abstract: With the rapid advancement of neural audio codecs, codec-based speech generation (CoSG) systems have become highly powerful. Unfortunately, CoSG also enables the creation of highly realistic deepfake speech, making it easier to mimic an individual's voice and spread misinformation. We refer to this emerging deepfake speech generated by CoSG systems as CodecFake.

arXiv CS 1d ago

The First Environmental Sound Deepfake Detection Challenge: Benchmarking Robustness, Evaluation, and Insights

Announce Type: replace Abstract: Recent progress in audio generation has made it increasingly easy to create highly realistic environmental soundscapes, which can be misused to produce deceptive content, such as fake alarms, gunshots, and crowd sounds, raising concerns for public safety and trust. While deepfake detection for speech and singing voice has been extensively studied, environmental sound deepfake detection (ESDD) remains underexplored. To advance ESDD, the first edition of the...

arXiv CS 1d ago

FoeGlass: Simple In-Context Learning Is Enough for Red Teaming Audio Deepfake Detectors

arXiv:2606.05101v1 Announce Type: new Abstract: Audio deepfake detection (ADD) models are critical for countering the malicious use of text-to-speech (TTS) models. Evaluating and strengthening ADD models requires developing datasets that span the space of generated audio and highlight high-error regions.

arXiv CS 6d ago

Escaping the Linearity Trap: Manifold Detours for Black-Box Adversarial Attacks on Singing Audio Deepfake Detection

arXiv:2605.30366v1 Announce Type: new Abstract: Recent Singing Voice Synthesis (SVS) advances enable highly realistic but potentially malicious AI covers, making singing voice deepfake detection (SVDD) crucial. Self-Supervised Learning (SSL)-based detectors achieve state-of-the-art performance by fine-tuning speech SSL backbones to capture singing-specific spoof artifacts.

arXiv CS 9d ago

AI voice scams can clone your family’s voice

It's your son's voice. He says he's been in a car accident. He's about to be arrested.

Fox News Tech 1d ago

BioLip: Language-Generalizable Lip-Sync Deepfake Detection via Biomechanical Constraint Violation Modeling

Announce Type: replace Abstract: Existing lip-sync deepfake detectors rely on pixel artifacts or audio-visual correspondence, and both fail under generator or language shift because the features they learn are tied to the training distribution. We take a different approach. Authentic lip motion is constrained by tissue mechanics and neuromuscular bandwidth; current generators typically do not impose these constraints, producing trajectories with elevated variance in velocity, acceleration,...

arXiv CS 7d ago