Home Knowledge Base Microscaling Formats

Microscaling Formats

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

MX-SAFE: Versatile Inference- and Training-Proof Microscaling Format with On-the-Fly Exponent and Mantissa Bit Allocation

arXiv:2605.24391v2 Announce Type: replace Abstract: As the demand for deep learning grows, cost reduction through quantization has become essential for both training and inference. In 2022, the Open Compute Project (OCP) consortium standardized narrow precision formats for deep learning, called the microscaling (MX) format. The MX format is a hardware-friendly dynamic quantization scheme that effectively reduces the data size by sharing an 8-bit exponent across multiple operands.

arXiv CS 7d ago

An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats

arXiv:2606.09686v1 Announce Type: new Abstract: Numeric format proliferation in machine learning hardware -- FP8 (E4M3 and E5M2), BF16, MXFP4, microscaling block formats, and dozens of research variants -- has outpaced the availability of vendor-neutral, bit-exact reference material. Engineers porting models across accelerators encounter silent divergences that are difficult to diagnose without a shared ruler.

arXiv CS 1d ago

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second

From the first roaring racer of the combustion age to the sonic boom that shattered the sound barrier, humanity's hunger for speed is written into our very DNA. The speed of AI reasoning is no different — it defines the boundaries of intelligence itself. When a model is fast enough, it ceases to be a tool you wait on and becomes an extension of your own thinking: responding in real time, iterating in an instant, collaborating without friction.

Hacker News 2d ago

dMX: Differentiable Mixed-Precision Assignment for Low-Precision Floating-Point Formats

arXiv:2606.04115v1 Announce Type: new Abstract: Quantizing large language models (LLMs) to low-precision floating-point representations is central to efficient deployment, yet applying a single bit-width uniformly across all layers is sub-optimal in terms of both performance and accuracy. This work introduces dMX, a differentiable mixed-precision quantization framework for learnable floating-point bit-width assignment. We study its application for the microscaling floating-point (MXFP)...

arXiv CS 6d ago

Mutation-dependent responses to sleep and exercise in clonal haematopoiesis

Abstract Clonal haematopoiesis (CH) activates inflammation and increases the risk of atherosclerosis1,2. Whether lifestyle alters CH clone expansion or the phenotypic programming of CH mutant cells, thereby affecting atherosclerosis, is unknown. Here, in humans and mice and across mutations in Jak2, Tet2, Trp53 and Dnmt3a, we demonstrate mutation-dependent responses to sleep and exercise in CH and show that mutant cells are uniquely sensitive to lifestyle.

Nature 1d ago