Bayesian Layers
Related Articles from SNS
Memory by Design: Probabilistic Sequence Layers
arXiv:2605.31163v1 Announce Type: cross Abstract: We introduce the design-model framework: a way to derive efficient recurrent sequence maps from explicit assumptions about memory. A design model writes evidence into memory by exact Bayesian filtering; a query-dependent readout produces a predictive distribution whose mean is the layer output. In our linear-Gaussian instantiation, the Bayesian Layer propagates both a mean and a covariance: the covariance tracks uncertainty over stored...
How Deep Are Deep GPs, Really? A Sharp Threshold and a Non-Gaussian Limit for Compositional GPs
arXiv:2606.08218v1 Announce Type: new Abstract: Compositional priors describe the generic properties of layered functions in deep Bayesian models, where deep neural networks with random weights are a canonical example. In the wide-network limit, the prior is a Gaussian process with a depth-dependent kernel, and its behaviour as depth grows has been extensively studied through this kernel. Here, we study another case, where each layer itself is a vector valued Gaussian process, and our aim is...
Variational Routing: A Scalable Bayesian Framework for Calibrated Mixture-of-Experts Transformers
Announce Type: replace Abstract: Foundation models are increasingly being deployed in contexts where understanding the uncertainty of their outputs is critical to ensuring responsible deployment. While Bayesian methods offer a principled approach to uncertainty quantification, their computational overhead renders their use impractical for training or inference at foundation model scale. State-of-the-art models achieve parameter counts in the trillions through carefully engineered sparsity...
Beyond Post-hoc Explanation: Toward Glassbox AI via Probabilistic Mediation
Announce Type: new Abstract: Large language models are rapidly becoming infrastructural components in high-stakes institutional settings, including public administration, legal reasoning, and healthcare, where opacity is not merely inconvenient but institutionally and legally untenable. Existing approaches to explainability are predominantly post-hoc, offering unstable, non-contestable accounts that have no formal relationship to the reasoning process that produced the output. We argue that...
Learning Whom to Trust: Market-Feedback Adaptive Retrieval for Frozen LLMs in Event-Driven Financial RAG
arXiv:2605.31201v1 Announce Type: new Abstract: Financial retrieval-augmented generation (RAG) systems typically rank evidence by textual relevance, but in financial markets the useful evidence source depends on event type, forecast horizon, and market context. We study news-triggered event-impact prediction as a point-in-time financial RAG problem. For each company-news anchor, the system retrieves related financial news and SEC filing passages, appends a pre-decision market-context card,...
Target localization, identification and sensing using latent symmetries
arXiv:2606.01421v1 Announce Type: new Abstract: We show that an array of scatterers which has been designed to have latent ("hidden") symmetries can be used as a sensor. We use the capacitance matrix as a canonical model for three-dimensional hybridisation and study how the introduction of an "intruder" scatterer breaks the latent symmetries. By analysing the degree to which each symmetry is broken, we identify the radius of the intruder and localize its position.
Structure-Preserving Correction Learning for Sparse Bayesian Inference in Brain Source Imaging
Announce Type: new Abstract: Classical sparse Type-II Bayesian methods for M/EEG brain imaging support joint estimation of source and noise hyperparameters, but rely on fixed iterative update rules. Although these updates are principled and interpretable, their dynamics cannot be adapted from data. We propose to learn the update mechanism itself while preserving the underlying Bayesian structure by unfolding a classical joint hyperparameter-learning solver into a trainable neural...
Is the Last Layer Sufficient for Uncertainty Quantification?
Announce Type: cross Abstract: Epistemic uncertainty quantification (UQ) for deep neural networks (DNNs) is a requirement for safe adoption of AI in mission-critical settings. Several leading methods for UQ linearize DNNs to form Bayesian Generalized Linear Models (GLMs), where epistemic uncertainty is modeled via the predictive posterior distribution. Linearizing around the parameters of the final connected layer of a DNN is a commonly used approximation for reducing the computational...
Bayesian Inference with Shaped Deep Non-linear MLPs
arXiv:2605.30860v1 Announce Type: cross Abstract: A central aim of deep learning theory is to characterize how neural networks make predictions in the regime of simultaneously large model and training set size. Since the limits of diverging number of model parameters and dataset size do not commute it is not clear a priori what limits exist. In this work, we shed new light on these questions by studying Bayesian inference in deep non-linear MLPs in the regime where the number of training...
Learning and Inferring Multiphase Flow Dynamics in Porous Media using Scientific Machine Learning: Application to the "FluidFlower" CO2 Injection Experiment
Announce Type: replace Abstract: Accurate prediction and parameter identification of multiphase flow in porous media remain central challenges in geological carbon dioxide storage due to strong nonlinearities, high-dimensional parameter spaces, and limited observational data. We present a machine learning framework that integrates surrogate modeling and Bayesian inference to enable efficient forward prediction and inverse parameter estimation for CO2-brine flows in geological media. The...