Home › Knowledge Base › The Floating-Point Multiplier

The Floating-Point Multiplier

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Accuracy-Configurable Floating-Point Multiplier Design for SRAM-Based Compute-in-Memory

arXiv:2606.08430v1 Announce Type: new Abstract: Digital Compute-in-Memory (DCiM) reduces data movement and has become a promising solution for energy-efficient edge AI. However, most existing DCiM frameworks still primarily target integer or fixed-point arithmetic, and provide limited support for compiler-integrated and accuracy-configurable floating-point computation. Directly integrating conventional IEEE 754 floating-point units into dense SRAM-based DCiM arrays, however, incurs high area...

arXiv CS 1d ago

GoldenFloat: A Phi-Derived Static-Split Floating-Point Family from GF4 to GF256 with a Lucas-Exact Integer Identity

arXiv:2606.05017v1 Announce Type: new Abstract: We present a hardware-oriented description of GoldenFloat (GF), a static-split floating-point family generated by a single closed rule, and three concrete artefacts: (i) an open multi-width RTL generator covering GF4-GF256 with a continuous-integration differential sweep against a correctly-rounded reference; (ii) an integer-backed Lucas-exact accumulator path verified at 500-digit precision for n = 1, ..., 256; and (iii) a GF16 FPGA codec...

arXiv CS 6d ago

O-POPE: High-Frequency Pipelined Outer Product based GEMM acceleration with minimal buffering overhead

arXiv:2606.02333v1 Announce Type: new Abstract: General matrix multiply (GEMM) dominates both execution time and energy consumption of modern machine learning (ML) workloads, placing increasing pressure on hardware efficiency. While quantization mitigates computational and data movement costs, accuracy-sensitive tasks such as training still require higher-precision floating-point formats. Existing floating-point GEMM accelerators face trade-offs between operating frequency, arithmetic...

arXiv CS 8d ago

Investigating Energy Bounds of Analog Compute-in-Memory with Local Normalization

arXiv:2602.08081v2 Announce Type: replace Abstract: Modern edge AI workloads demand maximum energy efficiency, motivating the pursuit of analog Compute-in-Memory (CIM) architectures. Simultaneously, the popularity of Large-Language-Models (LLMs) drives the adoption of low-bit floating-point formats which prioritize dynamic range. However, the conventional direct-accumulation CIM accommodates floating-points by normalizing them to a shared widened fixed-point scale.

arXiv CS 1d ago

Should you normalize RGB values by 255 or 256?

June 1st, 2026 Let’s say you’re writing an image processing program. The program takes in an image, converts it to floating point, does some processing and finally saves the modified pixels to disk as 8-bit colors. The question today concerns how exactly the integer-to-float conversion should be done.

Hacker News 9d ago

"They're made out of weights"

They're Made Out of Weights with apologies to Terry Bisson After Terry Bisson's "They're Made Out of Meat". "They're made out of weights." Floating-point numbers.

Hacker News 6d ago

Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks

Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks This post is a high-level explainer for my Master’s thesis, which involves designing hardware architectures for ultrafast inference and online learning using the Kolmogorov-Arnold Network (KAN) architecture. I’ll assume familiarity with standard machine learning concepts, as well as some understanding of hardware and digital circuits; read my previous post here for the latter. Please read the two papers below for more...

Hacker News 23h ago

RTLScout: Joint Agentic Code and Synthesis Optimization for Efficient Digital Circuits

Announce Type: new Abstract: We present RTLScout, an autonomous system that combines LLM-driven agentic design with circuit-level synthesis optimization and arithmetic architecture sweeps. An LLM agent iteratively writes, evaluates, and refines RTL designs using tool calls, guided by quantitative PPA (power, performance, area) feedback from Yosys and OpenROAD. We introduce a multi-run elite pool framework, where the best designs and lessons learned seed subsequent agent runs.

arXiv CS 2d ago