Sigmoid

Related Articles from SNS

SmartMixed: A Two-Phase Training Strategy for Adaptive Activation Function Learning in Neural Networks

The choice of activation function plays a critical role in neural networks, yet most architectures still rely on fixed, uniform activation functions across all neurons. We introduce SmartMixed, a novel two-phase training strategy that allows networks to learn optimal per-neuron activation functions while preserving computational efficiency at inference. In the first phase, neurons adaptively select from a pool of candidate activation functions (ReLU, Sigmoid,...

arXiv CS 1d ago

Capacity-Controlled Global Attention for Graph Transformers

arXiv:2604.17324v2. Global self-attention drives modern graph transformers, yet the softmax at its core imposes a structural constraint rarely examined directly: every attention row is non-negative and sums to one, so each per-head output is a mass-conserving convex combination of value vectors. A node can never "attend to nothing." We argue this conservation constraint is a single root cause behind three pathologies usually studied in isolation: the collapse...

arXiv CS 1d ago
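
The conservation constraint described in the abstract above can be checked on a toy example. The following is a minimal sketch in plain Python, not code from the paper: the helper names `softmax` and `attend`, the scalar values, and the input numbers are all hypothetical, chosen only to show that even when every score is very negative the weights still sum to one and the output stays inside the range of the values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(scores, values):
    """Weight scalar values by softmax(scores); return (weights, output)."""
    weights = softmax(scores)
    output = sum(w * v for w, v in zip(weights, values))
    return weights, output


# Hypothetical toy inputs: one query row whose scores are all very negative,
# i.e. the query "dislikes" every key.
scores = [-50.0, -60.0, -55.0]
values = [1.0, 2.0, 4.0]

weights, output = attend(scores, values)

# The row is still a probability distribution over the keys ...
print(all(w >= 0.0 for w in weights), abs(sum(weights) - 1.0) < 1e-9)
# ... so the output is a convex combination of the values.
print(min(values) <= output <= max(values))
```

Because softmax depends only on score differences, shifting all scores down does not reduce the total attention mass, which is the sense in which a row cannot "attend to nothing."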

Genomic Dimensionality Bounds Mixed-Model Association Power and Fine-Mapping Resolution

Mixed-model genome-wide association studies (GWAS) behave differently in livestock than in humans, yet a unified explanation is lacking. Analyses using the full genomic relationship matrix (full-GRM; from genome-wide SNPs) yield only a few significant peaks even with hundreds of thousands of animals, whereas leave-one-chromosome-out (LOCO), numerator-relationship-matrix, and sparse-GRM approaches report many broad associations over similar data. Here we develop a framework that traces these...

bioRxiv 5d ago

Human-Like Neural Nets by Catapulting

A speculative proposal to create artificial neural nets with human-like performance by training overparameterized NNs with high learning rates and regularization to trigger catapulting/grokking. Over-parameterization as a route to true generalization would resolve many outstanding mysteries of artificial versus natural intelligence. There are many mysteries about deep learning and human intelligence, but we could describe the biggest anomaly this way: why are...

Hacker News 3d ago

Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer

arXiv:2604.09414v5. A learning-to-defer (L2D) system decides, for each input, whether to predict on its own or to hand it to one of several available experts. The well-established recipe trains the classifier and router jointly by treating the $K$ classes and $J$ experts as competing actions in one shared $(K{+}J)$-action geometry. Subsequent work has proposed a series of incremental fixes within this geometry; we show that each still suffers, to...

arXiv CS 9d ago

Delayed Repression and Emergent Instability in Adaptive Multi-Agent Systems

arXiv:2605.30392v2. Regulatory institutions (from content moderation platforms to financial supervisors) observe, deliberate, and intervene only after a characteristic delay. We ask whether this processing lag alone can destabilize a multi-agent system that would otherwise remain stable, without exogenous shocks, coordination among agents, or malicious actors. We study this in two stages.

arXiv CS 7d ago

Separation Power of Equivariant Neural Networks

arXiv:2406.08966v3. The separation power of a machine learning model refers to its ability to distinguish between different inputs and is often used as a proxy for its expressivity. Indeed, knowing the separation power of a family of models is a necessary condition to obtain fine-grained universality results. In this paper, we analyze the separation power of equivariant neural networks, such as convolutional and permutation-invariant networks.

arXiv CS 5d ago

Millimeter-Wave UAV Channel Model with Height-Dependent Path Loss and Shadowing in Urban Scenarios

arXiv:2511.10763v2. Uncrewed Aerial Vehicles (UAVs) serving as Aerial Base Stations (ABSs) are expected to extend 6G millimeter-wave (mmWave) coverage and improve link reliability in urban areas. However, UAV-based Air-to-Ground (A2G) channels are highly dependent on height and urban geometry. This paper proposes an ABS height-dependent mmWave channel model and investigates whether urban geometry, beyond the standard built-up parameters, significantly affects...

arXiv CS 2d ago

Short-Term Developmental Trajectories of Dorsal-Ventral Pathways and Their Relationships with First-Grade Learning

The first year of formal schooling is a year of foundational reading and math learning, and individual differences emerging within this single year predict academic achievement decades later. Yet how brain changes throughout this critical year relate to individual differences in reading and math learning remains uncharacterized. In this pre-registered study (https://osf.io/97ybe), we acquired, on a monthly basis, both behavioral assessments of reading and math learning and diffusion-weighted MRI scans...

bioRxiv 7d ago

AttackPathGNN: Cross-function vulnerability detection in smart contracts using state interference graphs and conjunction pooling

arXiv:2606.05986v1. Existing learning-based detectors for Solidity smart contracts reduce vulnerability detection to syntactic pattern matching within single functions, yet many of the most consequential exploits (The DAO, Cream Finance) exist not in any individual function but in the relationship between functions and in the combination of conditions that make the attack feasible. Thus, we propose AttackPathGNN, a graph neural network (GNN) that reframes...

arXiv CS 5d ago