Multi-Sample
Related Articles from SNS
When Does Delegation Beat Majority? A Delegation-Based Aggregator for Multi-Sample LLM Inference
arXiv:2606.08098v1 Announce Type: new Abstract: Majority voting over sampled answers is the dominant unsupervised aggregator for multi-sample LLM inference. We show that piping the signals every sample carries into a delegation-based aggregator (Propagational Proxy Voting, PPV) yields an unsupervised consensus rule that beats majority on MMLU-Pro by +1.5 pp overall and +2.24 pp on the non-trivial subset (paired McNemar p ~ 1.0e-14, n = 8,099).
X-ray Response of the Fully-Depleted, p-Channel SiSeRO-CCD
Announce Type: replace Abstract: We present an X-ray characterization of a fully depleted, 725 $\mu$m thick p-channel SiSeRO-CCD. Measurements with a $^{55}$Fe source yield an energy resolution of $54 \pm 0.9$ eV ($14.6 \pm 0.25 e^{-}$) at 5.9 keV for single-pixel events, indicating that the SiSeRO amplifier preserves the intrinsic charge resolution of the CCD under multi-sample non-destructive readout. Characterization with a $^{241}$Am source extends the response to higher-energy photons,...
S23DR 2026 Winning Solution
arXiv:2606.06695v1 Announce Type: new Abstract: This text presents the winning solution to the S23DR 2026 challenge for structured 3D wireframe reconstruction from sparse SfM, fitted depth, and semantic segmentations. The method treats vertices as a conditional set and denoises 64 vertex tokens with a flow-matching DiT conditioned on Perceiver-style scene tokens. A global pass predicts the coarse structure, a hull-cropped second pass refines it, and a small multi-sample consensus step keeps...
ThinkBooster: A Unified Framework for Seamless Test-Time Scaling of LLM Reasoning
arXiv:2606.06915v2 Announce Type: replace Abstract: Test-time compute (TTC) scaling has emerged as a powerful paradigm for improving large language model (LLM) reasoning by allocating additional compute during inference, e.g., via multi-sample generation and verifier-based reranking. Existing TTC scaling strategies and reasoning scorers remain fragmented, evaluated under inconsistent protocols, and are rarely analyzed through the lens of quality-cost trade-offs. We introduce ThinkBooster, a...
ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models
arXiv:2605.18879v3 Announce Type: replace Abstract: Large language models inevitably retain sensitive information, defined as inputs that may induce harmful generations, due to training on massive web corpora, raising concerns for privacy and safety. Existing machine unlearning methods primarily rely on retraining or aggressive fine-tuning, which are either computationally expensive or prone to degrading related knowledge and overall model utility. In this work, we reformulate machine...
Cellpin enables reference-based imputation and denoising of spatial transcriptomes
Spatially resolved transcriptomics enables gene expression profiling within tissue architecture, but targeted panels leave much of the transcriptome unmeasured and spatial artifacts such as RNA diffusion and segmentation errors introduce technical noise. These limitations necessitate computational imputation and denoising, yet existing methods typically incorporate spatial measurements during training, limiting scalability and risking the embedding of technology-specific artifacts into...