Tanh
Related Articles from SNS
Indonesia’s Mount Merapi volcano erupts, spewing ash into the sky
Videos show Indonesia’s Mount Merapi spewing a column of ash around 2 kilometres high in West Sumatra’s Tanh Datar District. Authorities have enforced an “exclusion zone” within a 3-kilometre radius around Mount Merapi since an eruption in 2023. Published On 30 May 2026
Bounded Hyperbolic Tangent: A Stable and Efficient Alternative to Pre-Layer Normalization in Large Language Models
arXiv:2601.09719v3 Announce Type: replace Abstract: Pre-Layer Normalization (Pre-LN) is the de facto choice for large language models (LLMs) and is crucial for stable pretraining and effective transfer learning. However, Pre-LN incurs repeated statistical-computation overhead and remains vulnerable to the curse of depth, where hidden-state magnitudes and variances grow as the number of layers increases, destabilizing training. Efficiency-oriented normalization-free methods such as Dynamic...
Taming the Loss Landscape of PINNs with Noisy Feynman-Kac Supervision: Operator Preconditioning and Non-Asymptotic Error Bounds
arXiv:2606.00643v1 Announce Type: cross Abstract: Physics-Informed Neural Networks (PINNs) often train slowly or fail to converge on challenging partial differential equations (PDEs), a behavior recently linked to severely ill-conditioned loss landscapes inherited from the underlying differential operator. We study PINNs augmented with a pointwise data-fidelity term, added at a few points in the domain to the standard residual and boundary losses. We show that this supervision term acts as...
Energy-Efficient Implementation of Spiking Recurrent Cells on FPGA
arXiv:2605.10679v3 Announce Type: replace Abstract: Spiking Neural Networks (SNNs) can reduce energy consumption compared to conventional Artificial Neural Networks (ANNs) when spiking activity is sparse and the neuron model is hardware-friendly. However, biologically faithful models are often too costly to implement on FPGAs, whereas very simple models (e.g., IR/LIF) sacrifice part of the neuronal dynamics. In this work, we present an FPGA accelerator for an SNN using Spiking Recurrent Cell...
SmartMixed: A Two-Phase Training Strategy for Adaptive Activation Function Learning in Neural Networks
Announce Type: replace Abstract: The choice of activation function plays a critical role in neural networks, yet most architectures still rely on fixed, uniform activation functions across all neurons. We introduce SmartMixed, a novel two-phase training strategy that allows networks to learn optimal per-neuron activation functions while preserving computational efficiency at inference. In the first phase, neurons adaptively select from a pool of candidate activation functions (ReLU, Sigmoid,...
One-Shot Klein Cutting Planes for Lipschitz Geodesically Convex Optimization in Hyperbolic Space
arXiv:2605.17540v4 Announce Type: replace Abstract: Motivated by the COLT 2023 open problem of Criscitiello, Martínez-Rubio, and Boumal on deterministic first-order methods for Lipschitz geodesically convex optimization on Hadamard manifolds, we study hyperbolic space \[ \HH^d_{-\kappaC^2} =\{X\in\R^{d+1}:\ipL{X}{X}=-1,\ X_0>0\}, \qquad \ip{U}{V}_X=\kappaC^{-2}\ipL{U}{V}. \] For every geodesically convex $M$-Lipschitz function \[ f:\bar B_{\HH}(x_0,r)\to\R,\qquad s=\kappaC r, \] we give a...
Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks
This post is a high-level explainer for my Master’s thesis, which involves designing hardware architectures for ultrafast inference and online learning using the Kolmogorov-Arnold Network (KAN) architecture. I’ll assume familiarity with standard machine learning concepts, as well as some understanding of hardware and digital circuits; read my previous post here for the latter. Please read the two papers below for more...
Neural Spectral Element Methods for stiff multiphysics PDEs with electrochemical transport benchmarks
arXiv:2606.02335v1 Announce Type: cross Abstract: The Neural Spectral Element Method (NSEM) evaluates each network only at fixed Legendre-Gauss-Lobatto quadrature nodes and replaces all derivative calls with precomputed spectral differentiation matrices. The resulting deterministic loss enables limited-memory BFGS (L-BFGS) to reach residuals of 10^-9 to 10^-10. A Kosloff-Tal-Ezer coordinate map resolves electrochemical boundary layers, while a mesh-free neural mortar framework couples...
Physics-Informed Neural Network Modeling of Biodegradable Contaminant Transport through GCL/SL Composite Liners
arXiv:2606.04392v1 Announce Type: new Abstract: This study develops a two-domain physics-informed neural network framework for contaminant transport through a GCL/SL composite liner system, in which the thin GCL layer is treated using a steady-state advection-dispersion-biodegradation formulation and the underlying soil liner is modeled as a transient transport domain. Two formulations are evaluated against analytical and finite-element reference solutions under different leachate-head...