Sigmoid
Related Articles from SNS
SmartMixed: A Two-Phase Training Strategy for Adaptive Activation Function Learning in Neural Networks
Abstract: The choice of activation function plays a critical role in neural networks, yet most architectures still rely on fixed, uniform activation functions across all neurons. We introduce SmartMixed, a novel two-phase training strategy that allows networks to learn optimal per-neuron activation functions while preserving computational efficiency at inference. In the first phase, neurons adaptively select from a pool of candidate activation functions (ReLU, Sigmoid,...
Capacity-Controlled Global Attention for Graph Transformers
arXiv:2604.17324v2 Abstract: Global self-attention drives modern graph transformers, yet the softmax at its core imposes a structural constraint rarely examined directly: every attention row is non-negative and sums to one, so each per-head output is a mass-conserving convex combination of value vectors. A node can never "attend to nothing." We argue this conservation constraint is a single root cause behind three pathologies usually studied in isolation: the collapse...
Genomic Dimensionality Bounds Mixed-Model Association Power and Fine-Mapping Resolution
Mixed-model genome-wide association studies (GWAS) behave differently in livestock than in humans, yet a unified explanation is lacking. Analyses using the full genomic relationship matrix (full-GRM; from genome-wide SNPs) yield only a few significant peaks even with hundreds of thousands of animals, whereas leave-one-chromosome-out (LOCO), numerator-relationship-matrix, and sparse-GRM approaches report many broad associations over similar data. Here we develop a framework that traces these...
Human-Like Neural Nets by Catapulting
Speculative proposal to create artificial neural nets with human-like performance by high-learning-rate/regularization training of overparameterized NNs to trigger catapulting/grokking. Over-parameterization as a route to true generalization would resolve many outstanding mysteries of artificial versus natural intelligence. There are many mysteries about deep learning and human intelligence, but we could describe the biggest anomaly this way: why are...
Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer
arXiv:2604.09414v5 Abstract: A learning-to-defer (L2D) system decides, for each input, whether to predict on its own or to hand it to one of several available experts. The very well established recipe trains classifier and router jointly by treating the $K$ classes and $J$ experts as competing actions in one shared $(K{+}J)$-action geometry. Subsequent work has proposed a series of incremental fixes within this geometry; we show that each still suffers, to...
Delayed Repression and Emergent Instability in Adaptive Multi-Agent Systems
arXiv:2605.30392v2 Abstract: Regulatory institutions (from content moderation platforms to financial supervisors) observe, deliberate, and intervene only after a characteristic delay. We ask whether this processing lag alone can destabilize a multi-agent system that would otherwise remain stable, without exogenous shocks, coordination among agents, or malicious actors. We study this in two stages.
Separation Power of Equivariant Neural Networks
arXiv:2406.08966v3 Abstract: The separation power of a machine learning model refers to its ability to distinguish between different inputs and is often used as a proxy for its expressivity. Indeed, knowing the separation power of a family of models is a necessary condition to obtain fine-grained universality results. In this paper, we analyze the separation power of equivariant neural networks, such as convolutional and permutation-invariant networks.
Millimeter-Wave UAV Channel Model with Height-Dependent Path Loss and Shadowing in Urban Scenarios
arXiv:2511.10763v2 Abstract: Uncrewed Aerial Vehicles (UAVs) serving as Aerial Base Stations (ABSs) are expected to extend 6G millimeter-Wave (mmWave) coverage and improve link reliability in urban areas. However, UAV-based Air-to-Ground (A2G) channels are highly dependent on height and urban geometry. This paper proposes an ABS height-dependent mmWave channel model and investigates whether urban geometry, beyond the standard built-up parameters, significantly affects...
Short-Term Developmental Trajectories of Dorsal-Ventral Pathways and Their Relationships with First-Grade Learning
The first year of formal schooling is a year of foundational reading and math learning, and individual differences emerging within this single year predict academic achievement decades later. Yet, how brain changes throughout this critical year relate to individual differences in reading and math learning remains uncharacterized. In this pre-registered study (https://osf.io/97ybe), we acquired monthly both behavioral assessments of reading- and math-learning, and diffusion-weighted MRI scans...
AttackPathGNN: Cross-function vulnerability detection in smart contracts using state interference graphs and conjunction pooling
arXiv:2606.05986v1 Abstract: Existing learning-based detectors for Solidity smart-contracts reduce vulnerability detection to syntactic pattern matching within single functions, yet many of the most consequential exploits (The DAO, Cream Finance) exist not in any individual function but in the relationship between functions and in the combination of conditions that made the attack feasible. Thus, we propose AttackPathGNN, a graph neural network (GNN) that reframes...