Whisper

Related Articles from SNS

Whisper Hallucination Detection and Mitigation via Hidden Representation Steering and Sparse AutoEncoders

arXiv:2606.07473v1 Announce Type: new Abstract: Whisper, a widely adopted ASR model, is known to suffer from hallucinations - coherent transcriptions generated for non-speech audio entirely disconnected from the input. We investigate whether hallucinations can be detected and mitigated through Whisper's internal representations. We extract audio encoder activations and evaluate two representation spaces: raw Whisper activations and Sparse AutoEncoder (SAE) latents.

arXiv CS 2d ago

Jill Biden reveals what husband Joe whispered to her after Trump debate disaster in new book

Former US First Lady Jill Biden has revealed exactly what happened once the cameras stopped rolling on Joe's disaster TV debate with Donald Trump in her new memoir. Jill Biden has revealed what her husband Joe said to her in the moments after the disastrous TV debate with Donald Trump. The former US president's bumbling performance against his Republican rival in June 2024 - now considered to be the...

Daily Mirror 7d ago

Mapping Whisper Representations to Human ECoG Responses with Interpretable Time-Resolved Neural Encoding

Announce Type: cross Abstract: Understanding how speech foundation models relate to human cortical activity is a key challenge for computational neuroscience. Here, we investigate how internal representations from Whisper predict intracranial ECoG responses during naturalistic speech perception. We introduce a time-resolved neural encoder that combines speech embeddings with a recurrent temporal model and soft attention, allowing us to examine layer-wise brain alignment.

arXiv CS 8d ago

BaltiVoice: A Speech Corpus and Fine-tuned Whisper ASR System for the Balti Language

arXiv:2606.03504v1 Announce Type: new Abstract: We present BaltiVoice, a 16.8-hour read-speech corpus for Balti (ISO 639-3: bft), a Tibetic language spoken in Gilgit-Baltistan, Pakistan, with no prior publicly available ASR resources. The corpus contains 10,060 validated utterances in native Nastaliq script, derived from Mozilla Common Voice recordings. We fine-tune OpenAI Whisper-small on this corpus and report a Word Error Rate (WER) of 30.07% on a held-out validation set of 538...

arXiv CS 7d ago

Overcoming Decoder Inconsistencies in Whisper for Dravidian and Low-Resource Languages

Announce Type: new Abstract: Multilingual ASR models such as Whisper perform well on high-resource languages but exhibit substantially higher Word Error Rates (WER) for Dravidian languages compared to Indo-Aryan ones. Through linguistic and dataset analysis, we show that Dravidian languages have longer words, higher vocabulary diversity, and lower repetition, resulting in sparse token distributions and frequent character-level substitution errors. Baseline fine-tuning further reveals decoder...

arXiv CS 1d ago

CoughSense: Five-Class Respiratory Disease Classification via Whisper Encoder Fine-Tuning and Dual-Encoder Cross-Attention Fusion with Balanced Contrastive Learning

arXiv:2606.02998v1 Announce Type: new Abstract: Automated cough analysis offers a path to low-cost respiratory screening, but most existing work stops at binary COVID-19 detection. A practical tool needs to tell apart several respiratory conditions from one cough recording on a consumer smartphone. We present CoughSense, a system that sorts cough recordings into five classes.

arXiv CS 7d ago

ASKD-Whisper: Adaptive Self-knowledge Distillation for Efficient and Low-Latency Automatic Speech Recognition

arXiv:2601.19919v2 Announce Type: replace Abstract: Knowledge distillation (KD) is one of the most effective paradigms for compressing large-scale foundation models into deployable architectures. In the context of Automatic Speech Recognition (ASR), previous studies have predominantly focused on forcing the student model to strictly mimic the predictive distribution of a massive teacher model. However, this static dependency often presents an inherent trade-off: while the student rapidly...

arXiv CS 8d ago

After 11 years at Mars, NASA's MAVEN spacecraft went out with a whisper

NASA's MAVEN spacecraft was in excellent shape when it disappeared behind Mars on December 6 of last year. The routine passage, called an occultation, was supposed to last less than an hour, but ground teams didn't hear from the spacecraft when it was supposed to regain contact with Earth. The loss of communication triggered contingency plans for engineers to try to restore a link with MAVEN, which orbits Mars more than 200 million miles from Earth.

Ars Technica Science 6d ago