The Floating-Point Multiplier
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
Accuracy-Configurable Floating-Point Multiplier Design for SRAM-Based Compute-in-Memory
arXiv:2606.08430v1 Announce Type: new Abstract: Digital Compute-in-Memory (DCiM) reduces data movement and has become a promising solution for energy-efficient edge AI. However, most existing DCiM frameworks still primarily target integer or fixed-point arithmetic, and provide limited support for compiler-integrated and accuracy-configurable floating-point computation. Directly integrating conventional IEEE 754 floating-point units into dense SRAM-based DCiM arrays, however, incurs high area...
GoldenFloat: A Phi-Derived Static-Split Floating-Point Family from GF4 to GF256 with a Lucas-Exact Integer Identity
arXiv:2606.05017v1 Announce Type: new Abstract: We present a hardware-oriented description of GoldenFloat (GF), a static-split floating-point family generated by a single closed rule, and three concrete artefacts: (i) an open multi-width RTL generator covering GF4-GF256 with a continuous-integration differential sweep against a correctly-rounded reference; (ii) an integer-backed Lucas-exact accumulator path verified at 500-digit precision for n = 1, ..., 256; and (iii) a GF16 FPGA codec...
O-POPE: High-Frequency Pipelined Outer Product based GEMM acceleration with minimal buffering overhead
arXiv:2606.02333v1 Announce Type: new Abstract: General matrix multiply (GEMM) dominates both execution time and energy consumption of modern machine learning (ML) workloads, placing increasing pressure on hardware efficiency. While quantization mitigates computational and data movement costs, accuracy-sensitive tasks such as training still require higher-precision floating-point formats. Existing floating-point GEMM accelerators face trade-offs between operating frequency, arithmetic...
Investigating Energy Bounds of Analog Compute-in-Memory with Local Normalization
arXiv:2602.08081v2 Announce Type: replace Abstract: Modern edge AI workloads demand maximum energy efficiency, motivating the pursuit of analog Compute-in-Memory (CIM) architectures. Simultaneously, the popularity of Large-Language-Models (LLMs) drives the adoption of low-bit floating-point formats which prioritize dynamic range. However, the conventional direct-accumulation CIM accommodates floating-points by normalizing them to a shared widened fixed-point scale.
Should you normalize RGB values by 255 or 256?
June 1st, 2026 Let’s say you’re writing an image processing program. The program takes in an image, converts it to floating point, does some processing and finally saves the modified pixels to disk as 8-bit colors. The question today concerns how exactly the integer-to-float conversion should be done.
"They're made out of weights"
They're Made Out of Weights with apologies to Terry Bisson After Terry Bisson's "They're Made Out of Meat". "They're made out of weights." Floating-point numbers.
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks This post is a high-level explainer for my Master’s thesis, which involves designing hardware architectures for ultrafast inference and online learning using the Kolmogorov-Arnold Network (KAN) architecture. I’ll assume familiarity with standard machine learning concepts, as well as some understanding of hardware and digital circuits; read my previous post here for the latter. Please read the two papers below for more...
RTLScout: Joint Agentic Code and Synthesis Optimization for Efficient Digital Circuits
Announce Type: new Abstract: We present RTLScout, an autonomous system that combines LLM-driven agentic design with circuit-level synthesis optimization and arithmetic architecture sweeps. An LLM agent iteratively writes, evaluates, and refines RTL designs using tool calls, guided by quantitative PPA (power, performance, area) feedback from Yosys and OpenROAD. We introduce a multi-run elite pool framework, where the best designs and lessons learned seed subsequent agent runs.