Gumbel

Related Articles from SNS

Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences

arXiv:2605.30873v1 Announce Type: new Abstract: Federated Learning (FL) offers a privacy-preserving pathway for aligning Large Language Models (LLMs); however, existing frameworks typically enforce a monolithic reward model, inevitably averaging out inherently conflicting user preferences (e.g., helpfulness vs. harmlessness). While Variational Preference Learning (VPL) offers a pathway to personalization, adapting it to decentralized settings presents a fundamental challenge: posterior...

arXiv CS 9d ago

A hitchhiker's guide to Poisson gradient estimation

arXiv:2602.03896v2 Announce Type: replace-cross Abstract: Poisson-distributed latent variable models are widely used in computational neuroscience, but differentiating through discrete stochastic samples remains challenging. Two approaches address this: *Exponential Arrival Time* (EAT) simulation and *Gumbel-SoftMax* (GSM) relaxation. We provide the first systematic comparison of these methods, along with practical guidance for practitioners.

arXiv CS 9d ago
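
For readers unfamiliar with the Gumbel-SoftMax relaxation named in the abstract above, here is a minimal NumPy sketch of the general idea. It is not taken from the paper. The function names `sample_gumbel` and `gumbel_softmax`, the temperature `tau=0.5`, and the example probabilities are all illustrative assumptions.

```python
import numpy as np

def sample_gumbel(shape, rng):
    # Standard Gumbel(0, 1) noise via inverse-CDF sampling: g = -log(-log(u)).
    # The lower bound keeps u away from 0 so the inner log stays finite.
    u = rng.uniform(low=1e-12, high=1.0, size=shape)
    return -np.log(-np.log(u))

def gumbel_softmax(logits, tau, rng):
    # Relaxed (continuous) sample: softmax of Gumbel-perturbed logits divided
    # by a temperature tau. Smaller tau pushes the output closer to one-hot.
    z = (logits + sample_gumbel(logits.shape, rng)) / tau
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
probs = np.array([0.2, 0.3, 0.5])   # hypothetical categorical distribution
logits = np.log(probs)

# One relaxed sample: a point on the probability simplex, not a hard category.
soft = gumbel_softmax(logits, tau=0.5, rng=rng)

# Gumbel-max trick: the argmax of the perturbed logits is an exact categorical
# sample, so empirical frequencies should approach `probs`.
noise = sample_gumbel((20000, 3), rng)
hard = np.argmax(logits + noise, axis=-1)
freqs = np.bincount(hard, minlength=3) / hard.size
```

The relaxed sample `soft` is differentiable with respect to `logits` in an autodiff framework, which is what makes the relaxation usable for gradient estimation. The hard `argmax` sample is exact but not differentiable.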

Learning Temporal Causal Structure via Smooth Differentiable Optimization

arXiv:2606.03227v1 Announce Type: new Abstract: Causal discovery with instantaneous effects in multivariate time series is challenging, as the instantaneous structure must be acyclic. Prior methods enforce this by either separating instantaneous and lagged estimation into multi-stage pipelines or imposing algebraic acyclicity constraints via complex augmented Lagrangian optimization, both of which incur high computational cost.

arXiv CS 7d ago

Testing Most Influential Sets

arXiv:2510.20372v4 Announce Type: replace-cross Abstract: Small influential data subsets can dramatically impact model conclusions, with a few data points overturning key findings. While recent work identifies these most influential sets, there is no formal way to tell when maximum influence is excessive rather than expected under natural random sampling variation. We address this gap by developing a principled framework for most influential sets.

arXiv CS 7d ago

Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning

arXiv:2508.06588v3 Announce Type: replace Abstract: Vector Quantization (VQ) has recently emerged as a promising approach for learning compressed and discrete representations for graph-structured data. However, a fundamental challenge, i.e., codebook collapse, remains underexplored in the graph domain, significantly limiting the expressiveness and generalization of graph tokens. In this paper, we present an empirical study and observe that codebook collapse consistently occurs when training...

arXiv CS 8d ago

Neuro-Symbolic Predictive Process Monitoring

arXiv:2509.00834v2 Announce Type: replace Abstract: This paper addresses the problem of suffix prediction in Business Process Management (BPM) by proposing a Neuro-Symbolic Predictive Process Monitoring (PPM) approach that integrates data-driven learning with temporal logic-based prior knowledge. While recent approaches leverage deep learning models for suffix prediction, they often fail to satisfy even basic logical constraints due to the lack of explicit integration of domain knowledge...

arXiv CS 9d ago

LEAP: Learnable End-to-End Adaptive Pruning of Large Language Models

Announce Type: replace Abstract: Unstructured sparsity is now natively accelerated by recent GPU kernels and dataflow hardware, shifting the bottleneck from inference execution to the pruning algorithm. State-of-the-art methods for unstructured LLM pruning are layer-wise surrogates derived from the Optimal Brain Surgeon principle, and they sacrifice end-to-end accuracy, especially under aggressive sparsity.

arXiv CS 1d ago

Gradient estimators for parameter inference in discrete stochastic kinetic models

Announce Type: replace-cross Abstract: Stochastic kinetic models are ubiquitous in physics, yet inferring their parameters from experimental data remains challenging. For deterministic models, parameter inference often relies on gradients, which can be obtained efficiently through automatic differentiation (AD). However, AD cannot be applied directly to the Gillespie stochastic simulation algorithm (SSA), since sampling from a discrete set of reactions introduces non-differentiable operations.

arXiv CS 6d ago

Gradient estimators for parameter inference in discrete stochastic kinetic models

arXiv:2604.02121v2 Announce Type: replace Abstract: Stochastic kinetic models are ubiquitous in physics, yet inferring their parameters from experimental data remains challenging. For deterministic models, parameter inference often relies on gradients, which can be obtained efficiently through automatic differentiation (AD). However, AD cannot be applied directly to the Gillespie stochastic simulation algorithm (SSA), since sampling from a discrete set of reactions introduces...

arXiv Physics 6d ago