Semantic-Acoustic Tokenization
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
EntangleCodec: A Unified Discrete Audio Tokenizer via Semantic-Acoustic Entanglement
new Abstract: Audio tokenizers serve as the discrete interface between continuous audio and Audio Language Models (ALMs), but existing tokenizers often struggle to support both understanding and generation. Reconstruction-oriented codecs preserve acoustic fidelity but lack rich semantics, while semantic-aware tokenizers typically rely on separate semantic and acoustic streams, introducing redundancy or misalignment. We propose \textbf{EntangleCodec}, a unified discrete audio tokenizer that...
DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion
arXiv:2601.09239v5 Announce Type: replace Abstract: Speech tokenizers are a key building block of fully discrete Speech LLMs. Existing tokenizers either prioritize semantic encoding, fuse semantic content with acoustic style inseparably,or achieve incomplete semantic-acoustic disentanglement.
UniAudio-Token: Empowering Semantic Speech Tokenizers with General Audio Perception
arXiv:2605.31521v1 Announce Type: new Abstract: Semantic speech tokenizers have become a widely used interface for Audio-LLMs, owing to their compact single-codebook design and strong linguistic alignment. However, their focus on linguistic abstraction induces acoustic blindness, limiting their applicability beyond speech-centric tasks. We propose UniAudio-Token, a framework that empowers semantic tokenizers with general audio perception without compromising speech ability.