Rejection Sampling
Related Articles from SNS
Constrained Adaptive Rejection Sampling
Abstract: Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints. Existing approaches to constrained generation fall along a spectrum: greedy constrained decoding methods enforce validity during decoding but distort the LM's distribution, while rejection sampling (RS) preserves fidelity but wastes computation by discarding invalid outputs. Both extremes are problematic in domains such as...
Building Reliable Long-Form Generation via Hallucination Rejection Sampling
arXiv:2606.03628v1. Abstract: Large language models (LLMs) have achieved remarkable progress in open-ended text generation, yet they remain prone to hallucinating incorrect or unsupported content, which undermines their reliability. This issue is exacerbated in long-form generation due to hallucination snowballing, a phenomenon where early errors propagate and compound into subsequent outputs. To address this challenge, we propose a novel inference-time hallucination...
Post-Rejection Follow-up Sampling: A Methodology for Counterfactual Outcome Measurement in Algorithmic DEX Trading
arXiv:2606.08228v1. Abstract: Algorithmic trading systems on decentralised exchanges (DEXs) reject most candidate tokens they evaluate. The counterfactual outcome of rejected candidates (what would have happened had the system entered) is rarely measured. This paper introduces Post-Rejection Follow-up Sampling (PRFS).
Improving Selective Classification with Pairwise Queries for Binary Classification
arXiv:2605.30615v1. Abstract: In selective classification, a model predicts the labels of data samples where it is confident, and abstains from predicting labels for samples on which it is not confident. The rejected samples are often labeled by an expert, which is expensive.
Gradient-Guided Reward Optimization for Inference-time Alignment
arXiv:2606.09635v1. Abstract: Ensuring the reliability of Large Language Models (LLMs) under distribution drift requires inference-time adaptation. While inference-time alignment methods such as Best-of-$N$ and rejection sampling are widely used, they frame the task as a sampling-intensive, reward-guided search, leading to two key limitations: their performance is bounded by the base model's generation quality, and their reliance on imperfect reward models makes them...
Success Conditioning as Policy Improvement: The Optimization Problem Solved by Imitating Success
arXiv:2601.18175v2. Abstract: A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a desired outcome, and updates the policy to imitate the actions taken along successful trajectories. This principle appears under many names -- rejection sampling with SFT, goal-conditioned RL, Decision Transformers -- yet what optimization problem it solves, if any, has remained unclear. We prove that...
Protecting K-Nearest Neighbor Queries from Location Inference Attacks
Abstract: The k-nearest neighbor query (kNNQ) is a core component of modern location-based services (LBS) and has been widely adopted in popular features such as "people nearby". However, its potential privacy risks have long been overlooked. In this work, we present the first two attacks against kNNQ, namely the geometric intersection location inference attack (GI-LIA) and the zero-order optimization location inference attack (ZO-LIA), revealing the inherent location...
Diffusion Language Model Parallel Decoding via Product-of-Experts Bridge
arXiv:2606.08048v1. Abstract: Diffusion language models (DLMs) offer substantial speed advantages through parallel decoding, but the lack of token dependencies limits generation quality compared to autoregressive (AR) models. Recent progress attempts to bridge the gap via importance sampling, with DLM being the proposal and AR being the target. However, due to the huge gap between their distributions, the sampling requires a large number of particles and is thus expensive...
UnpredictaBench: A Benchmark for Evaluating Distributional Randomness in LLMs
arXiv:2606.06622v1. Abstract: We introduce UnpredictaBench, an evaluation that tests the ability of large language models (LLMs) to capture true underlying distributions. As LLMs are increasingly used as substitutes for other entities (e.g., for humans in economic simulations), the tendency of many models to collapse towards a single plausible answer means a failure to capture the unpredictability of real systems. Recent work on improving output diversity is insufficient...
Picasso: Holistic Scene Reconstruction with Physics-Constrained Sampling
arXiv:2602.08058v3. Abstract: In the presence of occlusions and measurement noise, geometrically accurate scene reconstructions -- which fit the sensor data -- can still be physically incorrect. For instance, when estimating the poses and shapes of objects in the scene and importing the resulting estimates into a simulator, small errors might translate to implausible configurations including object interpenetration or unstable equilibrium. This makes it difficult to...