Block

Related Articles from SNS

Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation

arXiv:2605.15913v4. Block attention, which processes the input as separate blocks that cannot attend to one another, offers significant potential to improve KV cache reuse in long-context scenarios such as Retrieval-Augmented Generation (RAG). However, its broader application is hindered by two key challenges: the difficulty of segmenting input text into meaningful, self-contained blocks, and the inefficiency of existing block fine-tuning methods that risk...

arXiv CS 5d ago
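
The abstract above describes block attention as splitting the input into blocks that cannot attend to one another. A minimal sketch of the corresponding attention mask follows. The function name `block_attention_mask`, the example block lengths, and the use of causal masking inside each block are assumptions made for illustration; they are not taken from the paper.

```python
def block_attention_mask(block_lengths):
    """Return mask[i][j] == True when token i may attend to token j.

    Hypothetical helper: tokens attend only within their own block,
    and (as an assumption) only to positions at or before their own.
    """
    # Label every token position with the index of its block.
    block_ids = []
    for block_index, length in enumerate(block_lengths):
        block_ids.extend([block_index] * length)

    n = len(block_ids)
    # Same block AND causal order; tokens in different blocks never see each other.
    return [
        [block_ids[i] == block_ids[j] and j <= i for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two blocks of lengths 2 and 3 give a block-diagonal, lower-triangular mask.
    for row in block_attention_mask([2, 3]):
        print("".join("x" if allowed else "." for allowed in row))
```

Because no token in one block depends on another block, the keys and values computed for a block do not change when the surrounding blocks change, which is the property the abstract links to KV cache reuse.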

MAGE: All-[MASK] Block Already Knows Where to Look in Block Diffusion LLM

arXiv:2602.14209v2. Block diffusion LLMs are an emerging paradigm for parallel language generation, but their KV caching makes memory access the dominant bottleneck in long-context inference. Sparse attention, which attends only to a small KV subset per query, can reduce this latency with minimal accuracy loss. In block diffusion, however, the B tokens of each block must share a single KV subset, and we show this per-block constraint degrades existing sparse...

arXiv CS 2d ago

Chatterbox-Flash: Prior-Calibrated Block Diffusion for Streaming Zero-Shot TTS

arXiv:2605.30748v2. We present Chatterbox-Flash, a zero-shot text-to-speech model obtained by fine-tuning a pretrained autoregressive TTS decoder into a block-diffusion decoder, enabling parallel token generation within each block while retaining block-by-block streaming. We find that naively transferring mainstream block-diffusion decoding to discrete speech tokens degrades quality, as a long-tail token distribution biases parallel position selection toward a...

arXiv CS 8d ago

Family of three died in 400ft plunge from luxury London apartment block

A mother, father and their terminally ill nine-year-old son died after falling from the 36th floor of an apartment block in south London on May 27, in what police believe was a suicide. Aditi, Rakesh and their son Sid had been living on the 36th floor of the 45-storey UNCLE tower block in Elephant and Castle,...

Daily Mirror 1d ago

WAV: Multi-Resolution Block Residual Routing for Deep Decoder-Only Transformers

Residual connections are central to training deep Transformers, but standard PreNorm residual streams aggregate sublayer updates with fixed unit weights. Recent Attention Residuals replace this fixed accumulation with content-dependent depth-wise routing, and Block Attention Residuals make the mechanism efficient by routing over block-level residual summaries. However, a single block summary stores only the low-frequency total residual displacement inside a...

arXiv CS 2d ago

SemBlock: Semantic Boundary Dynamic Blocks for Diffusion LLMs

arXiv:2606.04964v1. Diffusion language models (DLMs) generate text through iterative denoising, and blockwise decoding improves their practicality by committing tokens in local blocks. However, existing blockwise methods typically rely on fixed block sizes or delimiter-based runtime signals, which do not necessarily align with semantic boundaries. In this paper, we propose SemBlock, a semantic-boundary-driven dynamic block decoding framework for diffusion LLMs.

arXiv CS 6d ago

MixFP4: Enhancing NVFP4 with Adaptive FP4/INT4 Block Representations

As large language models continue to scale, fine-grained block-scaled low-precision formats such as NVFP4 are increasingly adopted for their substantial throughput and memory benefits. However, a single FP4 micro-format often mismatches heterogeneous block-level tensor statistics. To address this without changing the standard block-scaled MMA/GEMM execution path, we propose MixFP4, a mixed micro-format extension to NVFP4 that selects between two stored FP4...

arXiv CS 9d ago

T$^\star$: Progressive Block Scaling for Masked Diffusion Language Models Through Trajectory Aware Reinforcement Learning

arXiv:2601.11214v5. We present T$^\star$, a simple TraceRL-based training curriculum for progressive block-size scaling in masked diffusion language models (MDMs). Starting from an AR-initialized small-block MDM, T$^\star$ transitions smoothly to larger blocks, enabling higher-parallelism decoding with minimal performance degradation on math reasoning benchmarks. Moreover, further analysis suggests that T$^\star$ may actually converge to an alternative...

arXiv CS 6d ago

Protests over US Ebola centre in Kenya kill 2, court keeps block on site

A Kenyan court on Tuesday blocked for another three weeks a proposed US Ebola quarantine facility that has triggered protests killing two people, and ordered the government to disclose its agreement with Washington. The proposed 50-bed unit on an air force base in central Kenya for Americans exposed to the virus in Democratic Republic...

South China Morning Post 8d ago