MAGE: All-[MASK] Block Already Knows Where to Look in Block Diffusion LLM

arXiv CS, Monday 08 June 2026, 04:00 UTC
By Omin Kwon, Yeonjae Kim, Doyeon Kim, Minseo Kim, Yeonhong Park, Jae W. Lee

Key Points

arXiv:2602.14209v2

Abstract: Block diffusion LLMs are an emerging paradigm for parallel language generation, but their KV caching makes memory access the dominant bottleneck in long-context inference. Sparse attention, which attends only to a small KV subset per query, can reduce this latency with minimal accuracy loss. In block diffusion, however, the B tokens of each block must share a single KV subset, and we show this per-block constraint degrades existing sparse KV estimators by up to 25% in recall. We address this challenge by exploiting a property that emerges from the block-diffusion training objective: it aligns the block-average query across denoising steps, so the All-[MASK] block at the first step already reveals the per-block KV subset for the entire trajectory. We exploit this in MAGE ([MASK]-Guided Sparse Attention), a training-free method that runs one exact attention pass at the first step and reuses its top-k index sets for all remaining steps within the block. Across three block-diffusion families on LongBench, MAGE matches Exact Attention at k=512 with near-lossless accuracy, achieves up to 6.82x end-to-end speedup at 128K context, and runs up to 3.35x and 2.28x faster than Quest and SparseD, designed for AR LLMs and fully bidirectional diffusion LLMs, respectively.
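The mechanism described in the abstract (one exact attention pass at the first denoising step, then reuse of a shared top-k index set for the rest of the block) can be illustrated with a small NumPy sketch. This is not the authors' implementation: it covers a single attention head, and every function name here is hypothetical. The abstract does not say how the B queries are combined into one selection, so the sketch assumes per-key scores are averaged over the block, which is the same as scoring keys against the block-average query.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def exact_attention(q, k, v):
    """Dense attention for one head: q is (B, d); k and v are (N, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (B, N) pre-softmax logits
    return softmax(scores) @ v, scores

def select_block_topk(scores, top_k):
    """Choose one KV index set shared by all B tokens of the block.

    Assumption: logits are averaged over the B queries, which equals
    scoring every key against the block-average query.
    """
    block_scores = scores.mean(axis=0)           # (N,)
    top_k = min(top_k, block_scores.shape[0])
    idx = np.argpartition(block_scores, -top_k)[-top_k:]
    return np.sort(idx)

def sparse_attention(q, k, v, idx):
    """Attention restricted to the selected KV rows."""
    out, _ = exact_attention(q, k[idx], v[idx])
    return out

def denoise_block(queries_per_step, k, v, top_k):
    """queries_per_step: list of (B, d) arrays; entry 0 is the All-[MASK] step."""
    # Step 0: one exact pass, which also yields the scores used for selection.
    out0, scores = exact_attention(queries_per_step[0], k, v)
    idx = select_block_topk(scores, top_k)
    outputs = [out0]
    # Remaining steps: reuse the same index set instead of re-estimating it.
    for q in queries_per_step[1:]:
        outputs.append(sparse_attention(q, k, v, idx))
    return outputs, idx
```

In this sketch the sparse steps read only `top_k` rows of the cache instead of all N, which is where a memory-bound kernel would save time. With `top_k` equal to N the output matches dense attention exactly, which gives a simple sanity check.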


Originally published by arXiv CS.
