Home Knowledge Base Decoder

Decoder

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Block-Based Double Decoders

Announce Type: replace Abstract: Encoder-decoder models offer substantial inference-time savings over decoder-only models, but their pretraining objectives suffer from sparse supervision and dynamic sequence lengths, keeping them out of practice at scale. We propose block-based double decoders, a novel transformer architecture that utilizes doubly-causal block-based attention masks to train with full loss supervision and static sequence packing, combining decoder-only training efficiency...

arXiv CS 9d ago

Variational Speculative Decoding: Rethinking Draft Training from Token Likelihood to Sequence Acceptance

arXiv:2602.05774v4 Announce Type: replace Abstract: Speculative decoding accelerates inference for (M)LLMs, yet a training-decoding discrepancy persists: while existing methods optimize single greedy trajectories, decoding involves verifying and ranking multiple sampled draft paths. We propose Variational Speculative Decoding (VSD), formulating draft training as variational inference over latent proposals (draft paths). VSD maximizes the marginal probability of target-model acceptance,...

arXiv CS 1d ago

SimSD: Simple Speculative Decoding in Diffusion Language Models

Announce Type: new Abstract: Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) LLMs, offering faster inference through parallel or blockwise decoding. However, their masked language modeling formulation remains incompatible with standard token-level speculative decoding, one of the most effective acceleration techniques for AR models. In AR decoding, the causal mask preserves temporally valid token-level contexts, enabling a...

arXiv CS 8d ago

Neural decoding of speech using deep neural ensembles

Speech brain-computer interfaces (BCIs) can restore rapid communication to people with paralysis, but decoding errors still limit performance. In recent brain-to-text decoding competitions, deep ensemble methods, which combine predictions from multiple independently trained decoders, have delivered striking accuracy improvements and account for the largest gains over baseline approaches. However, these methods have not previously been tested in real-time, require substantial computational...

bioRxiv 6d ago

Fast-dLLM++: Fr\'{e}chet Profile Decoding for Faster Diffusion LLM Inference

Announce Type: new Abstract: Diffusion large language models promise parallel token generation, yet inference remains bottlenecked by deciding which masked tokens can be safely committed together. Fast-dLLM addressed this with KV caching and confidence-guided parallel decoding, but its decoding theory uses a homogeneous high-confidence assumption that effectively reduces each candidate set to its weakest selected token. We argue that this leaves speed on the table because real decoding steps...

arXiv CS 7d ago

How Much Parallelism Is "Free"? A Principle of Near-Free Parallelism for Parallel Decoding

arXiv:2605.30851v1 Announce Type: new Abstract: Parallel decoding improves generation efficiency by processing multiple decode positions within a single decode forward, but reported speedups conflate algorithmic token utilization with the system cost of executing multiple positions. We isolate the system side by introducing Near-Free Parallelism (NFP), the maximum number of positions executable at near-free latency. Analyzing Dense FFNs, MoE FFNs, and Attention against an idle-compute...

arXiv CS 9d ago

Breaking the Cascade: Compact Nonlinear Optical Computing with Single-Layer Encoder-Decoder Co-Localization

Announce Type: new Abstract: We demonstrate that nonlinear computing can be achieved with a single linear diffractive surface under coherent illumination. We introduce a compact encoder-decoder co-localization (E+D) architecture in which an input-dependent dynamic encoder and a static optimized decoder are integrated within the same phase-only diffractive plane. Following free-space propagation, coherent interference between the encoder and decoder fields, combined with intensity detection,...

arXiv Physics 8d ago

DAPD: Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs

arXiv:2603.12996v2 Announce Type: replace Abstract: Parallel decoding for Diffusion LLMs (dLLMs) is difficult because each denoising step provides only token-wise marginal distributions, while unmasking multiple tokens simultaneously requires accounting for inter-token dependencies. We propose Dependency-Aware Parallel Decoding (DAPD), a simple, training-free decoding method that uses self-attention to induce a conditional dependency graph over masked tokens. At each iteration, edges in this...

arXiv CS 8d ago

Breaking the Cascade: Compact Nonlinear Optical Computing with Single-Layer Encoder-Decoder Co-Localization

Announce Type: cross Abstract: We demonstrate that nonlinear computing can be achieved with a single linear diffractive surface under coherent illumination. We introduce a compact encoder-decoder co-localization (E+D) architecture in which an input-dependent dynamic encoder and a static optimized decoder are integrated within the same phase-only diffractive plane. Following free-space propagation, coherent interference between the encoder and decoder fields, combined with intensity...

arXiv CS 8d ago

Continuous Language Diffusion as a Decoder-Interface Problem

arXiv:2606.08810v1 Announce Type: new Abstract: Gaussian-corrupted sentence embeddings have no direct linguistic interpretation, yet continuous diffusion language models can generate fluent text from them. We study this puzzle through Embedded Language Flows (ELF) and identify a decoder-basin mechanism: denoising succeeds when trajectories reach regions where the native decoder can read stable tokens. We introduce a diagnostic protocol for denoisability, semantic recoverability, order...

arXiv CS 1d ago