Home Knowledge Base Rejection Sampling

Rejection Sampling

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Constrained Adaptive Rejection Sampling

Announce Type: replace Abstract: Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints. Existing approaches to constrained generation fall along a spectrum: greedy constrained decoding methods enforce validity during decoding but distort the LM's distribution, while rejection sampling (RS) preserves fidelity but wastes computation by discarding invalid outputs. Both extremes are problematic in domains such as...

arXiv CS 6d ago

Building Reliable Long-Form Generation via Hallucination Rejection Sampling

arXiv:2606.03628v1 Announce Type: new Abstract: Large language models (LLMs) have achieved remarkable progress in open-ended text generation, yet they remain prone to hallucinating incorrect or unsupported content, which undermines their reliability. This issue is exacerbated in long-form generation due to hallucination snowballing, a phenomenon where early errors propagate and compound into subsequent outputs. To address this challenge, we propose a novel inference-time hallucination...

arXiv CS 7d ago

Post-Rejection Follow-up Sampling: A Methodology for Counterfactual Outcome Measurement in Algorithmic DEX Trading

arXiv:2606.08228v1 Announce Type: cross Abstract: Algorithmic trading systems on decentralised exchanges (DEXs) reject most candidate tokens they evaluate. The counterfactual outcome of rejected candidates (what would have happened had the system entered) is rarely measured. This paper introduces Post-Rejection Follow-up Sampling (PRFS).

arXiv CS 1d ago

Improving Selective Classification with Pairwise Queries for Binary Classification

arXiv:2605.30615v1 Announce Type: new Abstract: In selective classification, a model predicts the labels of data samples where it is confident, and abstains from predicting labels for samples on which it is not confident. The rejected samples are often labeled by an expert, which is expensive.

arXiv CS 9d ago

Gradient-Guided Reward Optimization for Inference-time Alignment

arXiv:2606.09635v1 Announce Type: new Abstract: Ensuring the reliability of Large Language Models (LLMs) under distribution drift requires inference-time adaptation. While inference-time alignment methods such as Best-of-$N$ and rejection sampling are widely used, they frame the task as a sampling-intensive, reward-guided search, leading to two key limitations: their performance is bounded by the base model's generation quality, and their reliance on imperfect reward models makes them...

arXiv CS 1d ago

Success Conditioning as Policy Improvement: The Optimization Problem Solved by Imitating Success

arXiv:2601.18175v2 Announce Type: replace Abstract: A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a desired outcome, and updates the policy to imitate the actions taken along successful trajectories. This principle appears under many names -- rejection sampling with SFT, goal-conditioned RL, Decision Transformers -- yet what optimization problem it solves, if any, has remained unclear. We prove that...

arXiv CS 6d ago

Protecting K-Nearest Neighbor Queries from Location Inference Attacks

Announce Type: new Abstract: The k-nearest neighbor query (kNNQ) is a core component of modern location-based services (LBS) and has been widely adopted in popular features such as ``people nearby''. However, its potential privacy risks have long been overlooked. In this work, we present the first two attacks against kNNQ, namely the geometric intersection location inference attack (GI-LIA) and the zero-order optimization location inference attack (ZO-LIA), revealing the inherent location...

arXiv CS 5d ago

Diffusion Language Model Parallel Decoding via Product-of-Experts Bridge

arXiv:2606.08048v1 Announce Type: new Abstract: Diffusion language models (DLMs) offer substantial speed advantages through parallel decoding, but the lack of token dependencies limits generation quality compared to autoregressive (AR) models. Recent progress attempts to bridge the gap via importance sampling, with DLM being the proposal and AR being the target. However, due to the huge gap between their distributions, the sampling requires a large number of particles and is thus expensive...

arXiv CS 1d ago

UnpredictaBench: A Benchmark for Evaluating Distributional Randomness in LLMs

arXiv:2606.06622v1 Announce Type: new Abstract: We introduce UnpredictaBench, an evaluation that tests the ability of large language models (LLMs) to capture true underlying distributions. As LLMs are increasingly used as substitutes for other entities (e.g., for humans in economic simulations), the tendency of many models to collapse towards a single plausible answer means a failure to capture the unpredictability of real systems. Recent work on improving output diversity is insufficient...

arXiv CS 2d ago

Picasso: Holistic Scene Reconstruction with Physics-Constrained Sampling

arXiv:2602.08058v3 Announce Type: replace Abstract: In the presence of occlusions and measurement noise, geometrically accurate scene reconstructions -- which fit the sensor data -- can still be physically incorrect. For instance, when estimating the poses and shapes of objects in the scene and importing the resulting estimates into a simulator, small errors might translate to implausible configurations including object interpenetration or unstable equilibrium. This makes it difficult to...

arXiv CS 8d ago