Neural Token Reconstruction

Related Articles from SNS

NTR: Neural Token Reconstruction for Scene Token Bottleneck in End-to-End Driving

Announce Type: new Abstract: Recent perception-free end-to-end (E2E) autonomous driving methods bypass explicit perception outputs by compressing dense image patch tokens into compact scene tokens for downstream trajectory generation and scoring. While these scene tokens form a compact visual bottleneck for the planner, they receive supervision solely from the planning objective, providing limited constraints on the encoded visual information.

arXiv CS 9d ago

Neural Field Tokenizations with Hierarchy and Spatial Locality Priors

Announce Type: new Abstract: Neural fields parameterize data as functions from coordinates to values, providing a unified framework for representation learning across modalities. Existing approaches are dominated by per-sample meta-learning, which scales poorly due to memory-intensive inner-loop optimization. The natural alternative -- feed-forward encoding -- typically introduces modality-specific assumptions, sacrificing the generality that makes learning with neural fields attractive.

arXiv CS 1d ago

CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding

Announce Type: new Abstract: Neural audio codecs are a key component of speech processing pipelines, compressing audio into discrete tokens for downstream modeling. However, existing codecs struggle to balance reconstruction quality with token efficiency, often encoding perceptually irrelevant information such as background noise and recording artifacts at the expense of linguistically and acoustically meaningful content. We reframe audio tokenization as a selective information bottleneck...

arXiv CS 6d ago

DSL-Topic: Improving Topic Modeling by Distilling Soft Labels from Language Models

arXiv:2602.17907v2 Announce Type: replace Abstract: Traditional neural topic models are typically optimized by reconstructing the document's Bag-of-Words (BoW) representations, overlooking contextual information and struggling with data sparsity. In this work, we introduce a novel topic model training framework by Distilling Soft Labels (DSL) from Language Models (LMs). To construct the contextually enriched reconstruction signals, we project the next token probabilities, conditioned on a...

arXiv CS 6d ago

HybridCodec: Fast Dual-Stream, Semantically Enhanced Neural Audio Codec

arXiv:2606.06743v1 Announce Type: new Abstract: The popularity of neural audio codecs as speech tokenizers has surged with the advent of Multimodal Large Language Models. New codec architectures with semantic and acoustic disentanglement have emerged. There are two main approaches to introduce semantic information into codec models: one distills semantic information from SSL representations into the first RVQ layer, while the other maintains separate streams for semantic and acoustic features.

arXiv CS 2d ago

Adaptive Tokenisation Via Temporal Redundancy Masking And Latent Inpainting

arXiv:2606.06158v1 Announce Type: new Abstract: Adaptive video tokenisation seeks to dynamically allocate token budgets based on the underlying visual complexity of a sequence. Current continuous-regime approaches achieve this via iterative binarised searches or trained neural regressors, while discrete methods often require a full-rate decoder pass to estimate information content. We demonstrate that such computational overheads are not strictly necessary.

arXiv CS 5d ago

Channel-Oriented Design for EEG-to-Music Reconstruction

Announce Type: new Abstract: Brain-computer interfaces aim to decode naturalistic stimuli from neural signals, yet most progress to date has focused on vision and language. In this article, we study a more challenging but far less explored setting, EEG-to-music reconstruction, where signals are weak, distributed, and highly susceptible to noise and channel variability. Our central finding is that early channel mixing destroys weak but discriminative EEG signals.

arXiv CS 6d ago

Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning

arXiv:2508.06588v3 Announce Type: replace Abstract: Vector Quantization (VQ) has recently emerged as a promising approach for learning compressed and discrete representations for graph-structured data. However, a fundamental challenge, i.e., codebook collapse, remains underexplored in the graph domain, significantly limiting the expressiveness and generalization of graph tokens. In this paper, we present an empirical study and observe that codebook collapse consistently occurs when training...

arXiv CS 8d ago