SWALU

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

SaluNet: Enabling Total Plasticity in Normalization-Free Deep Networks

Abstract: Normalization layers such as BatchNorm and LayerNorm have long been considered essential for stable training in deep networks. This work demonstrates that they can be fully replaced by a single learnable activation mechanism. We identify a plasticity suppression effect induced by standard normalization: learnable activation parameters rapidly lose adaptability when paired with normalization layers.

arXiv CS 7d ago