HuggingFace
Related Articles from SNS
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis
arXiv:2606.09682v1 Announce Type: new Abstract: AutoMegaKernel (AMK) compiles a HuggingFace Llama-family model into a single persistent cooperative CUDA kernel that runs the whole forward pass in one launch, with no per-model hand-written CUDA. The contribution is the system, not raw speed. A frozen schedule-IR validator statically certifies deadlock-freedom and race-freedom via static graph checks (not a mechanized proof), so an unsafe agent-proposed schedule is rejected before launch:...
Interpreto: An Explainability Library for Transformers
arXiv:2512.09730v3 Announce Type: replace Abstract: Interpreto is an open-source Python library for interpreting HuggingFace language models, from early BERT variants to LLMs. It provides two complementary families of methods: attribution methods and concept-based explanations. The library bridges recent research and practical tooling by exposing explanation workflows through a unified API for both classification and text generation.
Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines
arXiv:2605.25645v2 Announce Type: replace Abstract: We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical comparison of TPU and GPU platforms for large language model adaptation. Using LoRA on a Google TPU v5p-8 for training and TPU v6e-8 (Trillium) for inference, we document the full set of code-level adaptations required to port a GPU-native training recipe, built on PyTorch, HuggingFace TRL, and FSDP, to...
A pictorial introduction to differential geometry (2017)
Differential Geometry [Submitted on 21 Sep 2017] Title: A pictorial introduction to differential geometry, leading to Maxwell's equations as three pictures. Abstract: In this article we present pictorially the foundation of differential geometry, which is a crucial tool for multiple areas of physics, notably general and special relativity, but also mechanics, thermodynamics and solving differential equations. As all the concepts are presented as pictures, there are no equations in this...
Unified Controllable and Faithful Text-to-CAD Generation with LLMs
Computer Science > Computation and Language [Submitted on 27 Mar 2026] Title: PR-CAD: Progressive Refinement for Unified Controllable and Faithful Text-to-CAD Generation with Large Language Models. Abstract: The construction of CAD models has traditionally relied on labor-intensive manual operations and specialized expertise. Recent advances in large language models (LLMs) have inspired research into text-to-CAD generation.
Nonlinear Arithmetic with SMTLIB Division is Undecidable
Computer Science > Logic in Computer Science. This paper has been withdrawn by Dejan Jovanovic. [Submitted on 25 May 2026 (v1), last revised 2 Jun 2026 (this version, v2)] Title: Nonlinear Arithmetic with SMTLIB Division is Undecidable
The Grothendieck Constant is Less Than $\frac{\pi}{2 \log (1+ \sqrt{2})} - 10^{-5}$
Computer Science > Data Structures and Algorithms [Submitted on 2 Jun 2026] Title: The Grothendieck Constant is Less Than $\frac{\pi}{2 \log (1+ \sqrt{2})} - 10^{-5}$. Abstract: We prove that the Grothendieck constant $K_G < \frac{\pi}{2 \log (1+ \sqrt{2})} - 10^{-5}$. This improves on the work of Braverman et.
Unsupervised Learning Based Focal Stack Camera Depth Estimation
Electrical Engineering and Systems Science > Image and Video Processing [Submitted on 14 Mar 2022 (v1), last revised 3 Jun 2026 (this version, v3)] Title: Unsupervised Learning Based Focal Stack Camera Depth Estimation. Abstract: We propose an unsupervised deep-learning-based method to estimate depth from focal stack camera images. On the NYU-v2 dataset, our method achieves much better depth estimation accuracy compared to single-image based methods.
Variational free complement method with Gaussian-expanded complement functions: convergence with fixed Gaussian expansion length
Physics > Chemical Physics [Submitted on 1 Jun 2026] Title: Variational free complement method with Gaussian-expanded complement functions: convergence with fixed Gaussian expansion length. Abstract: For the free complement theory with Gaussian-expanded complement functions, the energy convergence of $n_\mathrm{G} = \mathrm{constant} < \infty, n\rightarrow\infty$ is discussed, where $n_\mathrm{G}$ is the number of the Gaussian functions in the STO-$n$G expansion.