Home Knowledge Base Bounded Hyperbolic Tanh

Bounded Hyperbolic Tanh

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Bounded Hyperbolic Tangent: A Stable and Efficient Alternative to Pre-Layer Normalization in Large Language Models

arXiv:2601.09719v3 Announce Type: replace Abstract: Pre-Layer Normalization (Pre-LN) is the de facto choice for large language models (LLMs) and is crucial for stable pretraining and effective transfer learning. However, Pre-LN incurs repeated statistical-computation overhead and remains vulnerable to the curse of depth, where hidden-state magnitudes and variances grow as the number of layers increases, destabilizing training. Efficiency-oriented normalization-free methods such as Dynamic...

arXiv CS 6d ago

One-Shot Klein Cutting Planes for Lipschitz Geodesically Convex Optimization in Hyperbolic Space

arXiv:2605.17540v4 Announce Type: replace Abstract: Motivated by the COLT 2023 open problem of Criscitiello, Mart\'inez-Rubio, and Boumal on deterministic first-order methods for Lipschitz geodesically convex optimization on Hadamard manifolds, we study hyperbolic space \[ \HH^d_{-\kappaC^2} =\{X\in\R^{d+1}:\ipL{X}{X}=-1,\ X_0>0\}, \qquad \ip{U}{V}_X=\kappaC^{-2}\ipL{U}{V}. For every geodesically convex $M$-Lipschitz function \[ f:\bar B_{\HH}(x_0,r)\to\R,\qquad s=\kappaC r, \] we give a...

arXiv CS 9d ago