Codec

Related Articles from SNS

HybridCodec: Fast Dual-Stream, Semantically Enhanced Neural Audio Codec

arXiv:2606.06743v1 Announce Type: new Abstract: The popularity of neural audio codecs as speech tokenizers has surged with the advent of Multimodal Large Language Models. New codec architectures with semantic and acoustic disentanglement have emerged. There are two main approaches to introduce semantic information into codec models: one distills semantic information from SSL representations into the first RVQ layer, while the other maintains separate streams for semantic and acoustic features.

arXiv CS 2d ago

CodecFake+: Codec-Based Resynthesized Data as a Proxy for Detecting CodecFake Speech

Announce Type: replace Abstract: With the rapid advancement of neural audio codecs, codec-based speech generation (CoSG) systems have become highly powerful. Unfortunately, CoSG also enables the creation of highly realistic deepfake speech, making it easier to mimic an individual's voice and spread misinformation. We refer to this emerging deepfake speech generated by CoSG systems as CodecFake.

arXiv CS 1d ago

Burn-9 makes an entire game out of Metal Gear's Codec scenes

Plus, check out the new trailer for the cosmic horror game Penguin Colony. "Do you think love can bloom, even on a battlefield?" It's one of many questions the Metal Gear series asks players to consider across its many cutscenes and Codec calls.

Engadget 4d ago

Spatial Artifact Coherence Determines Codec Robustness in Patch-Based rPPG

arXiv:2606.04198v1 Announce Type: new Abstract: Remote photoplethysmography (rPPG) achieves low heart-rate error on uncompressed benchmarks yet is deployed over compressed video channels in telehealth, neonatal ICU, and driver fatigue applications. No prior work identifies the physical quantity determining when spatial decomposition outperforms global-projection methods under codec compression. We propose Spatial Artifact Coherence (SAC), defined as the ratio of off-diagonal to diagonal...

arXiv CS 6d ago

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

Announce Type: new Abstract: The rapid development of large language models (LLMs) has led to remarkable advances in natural language processing. However, the increasing scale of these models introduces substantial challenges in terms of storage, transmission, and deployment. Though great efforts have been devoted to model compression and quantization, existing methods often rely on fine-tuning or calibration data, which exhibit limited generalization across different tensor types.

arXiv CS 5d ago

Dav2d

Let dav2d be dav2d. A codec does not really exist until everyone can decode it. Today, we announce dav2d, a fast decoder for the new AV2 codec, developed by members of the VideoLAN community. A few weeks ago, we opened the repository and started development in public.

Hacker News 10d ago

CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding

Announce Type: new Abstract: Neural audio codecs are a key component of speech processing pipelines, compressing audio into discrete tokens for downstream modeling. However, existing codecs struggle to balance reconstruction quality with token efficiency, often encoding perceptually irrelevant information such as background noise and recording artifacts at the expense of linguistically and acoustically meaningful content. We reframe audio tokenization as a selective information bottleneck...

arXiv CS 6d ago

GOPAgen: Motion-Aware and Efficient Agentic Long-Video Understanding with Structural Memory and Hierarchical Reasoning

Announce Type: new Abstract: Despite significant progress in agentic long video understanding, existing methods still lack detailed motion comprehension coupled with an efficient memory architecture. In this paper, we propose GOPAgen, a novel approach that first integrates video codec into the video understanding framework via a meticulously designed motion agent trained on Groups of Pictures (GOPs) from video codec. We further develop a GOP tree reasoning algorithm, which is naturally...

arXiv CS 2d ago

Mitigating Proxy-to-Wild Domain Gap in Deepfake Speech

arXiv:2606.07494v1 Announce Type: new Abstract: Recent neural audio codec-based speech generation (CodecFake) produces highly realistic audio, posing a challenge to existing deepfake countermeasure models. While using codec resynthesized speech (CoRS) as proxy data improves performance, it often suffers from limited generalization.

arXiv CS 2d ago