ALMAB-DC: Active Learning, Multi-Armed Bandits, and Distributed Computing for Sequential Experimental Design and Black-Box Optimization

arXiv CS, Thursday 04 June 2026, 04:00 UTC. By Foo Hui-Mean, Yuan-chin I Chang

Key Points

arXiv:2603.21180v4. Announce type: replace.

Abstract: Sequential experimental design under expensive, gradient-free objectives is a central challenge in computational statistics: evaluation budgets are tightly constrained and information must be extracted efficiently from each observation. We propose ALMAB-DC, a GP-based sequential design framework combining active learning, multi-armed bandits (MAB), and distributed asynchronous computing for expensive black-box experimentation. A Gaussian process surrogate with uncertainty-aware acquisition identifies informative query points; a UCB or Thompson-sampling bandit controller allocates evaluations across parallel workers; and an asynchronous scheduler handles heterogeneous runtimes. We present cumulative regret bounds for the bandit components and characterize parallel scalability via Amdahl's Law. We validate ALMAB-DC on five benchmarks. On the two statistical experimental-design tasks, ALMAB-DC achieves lower simple regret than Equal Spacing, Random, and D-optimal designs in dose-response optimization, and in adaptive spatial field estimation matches the Greedy Max-Variance benchmark while outperforming Latin Hypercube Sampling; at K = 4 the distributed setting reaches target performance in one-quarter of the sequential wall-clock rounds. On three ML/engineering tasks (CIFAR-10 HPO, CFD drag minimization, MuJoCo RL), ALMAB-DC achieves 93.4% CIFAR-10 accuracy (outperforming BOHB by 1.7 pp and Optuna by 1.1 pp), reduces airfoil drag to C_D = 0.059 (36.9% below Grid Search), and improves RL return by 50% over Grid Search. All advantages over non-ALMAB baselines are statistically significant under Bonferroni-corrected Mann-Whitney U tests. Distributed execution achieves 7.5x speedup at K = 16 agents, consistent with Amdahl's Law.
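To make the Amdahl's Law claim concrete, the sketch below inverts the law to ask what parallelizable fraction would produce the reported 7.5x speedup at K = 16. This is a back-of-envelope reading, not a figure from the abstract: it assumes the idealized form S(K) = 1 / ((1 - p) + p / K) holds exactly, and the function names are hypothetical. Under that assumption the implied fraction is roughly p = 0.92.

```python
def amdahl_speedup(p: float, k: int) -> float:
    """Idealized Amdahl's Law: speedup with k workers when a fraction p
    of the total work can be parallelized."""
    return 1.0 / ((1.0 - p) + p / k)


def implied_parallel_fraction(speedup: float, k: int) -> float:
    """Invert Amdahl's Law: the fraction p that yields `speedup` at k workers.

    From 1/S = 1 - p * (k - 1) / k, so p = (1 - 1/S) * k / (k - 1).
    """
    return (1.0 - 1.0 / speedup) * k / (k - 1)


# Reported in the abstract: 7.5x speedup at K = 16 agents.
p = implied_parallel_fraction(7.5, 16)
print(f"implied parallel fraction p = {p:.3f}")

# Speedups the same (assumed) p would predict at other worker counts.
for k in (1, 4, 16, 64):
    print(f"K = {k:>2}: predicted speedup = {amdahl_speedup(p, k):.2f}")

# Asymptotic ceiling as K grows without bound: 1 / (1 - p).
print(f"upper bound on speedup = {1.0 / (1.0 - p):.2f}")
```

Under this model the serial remainder (1 - p) caps the achievable speedup no matter how many workers are added, which is why adding agents beyond K = 16 would give diminishing returns.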

Originally published by arXiv CS.

