Home Business & Finance Col-Bandit: Query-Time Top-$K$ Estimation for...
Business & Finance

Col-Bandit: Query-Time Top-$K$ Estimation for Late-Interaction Retrieval

Key Points

Announce Type: replace Abstract: Multi-vector late-interaction retrievers such as ColBERT achieve state-of-the-art quality, but their query-time cost is dominated by exhaustively computing token-level MaxSim interactions for every candidate document. The MaxSim scores of $N$ candidates against $T$ query tokens form an $N\times T$ matrix whose row-sums are the late-interaction scores, and identifying the top-$K$ rarely requires every entry. We introduce Col-Bandit, a query-time estimator of...

arXiv:2602.02827v2 Announce Type: replace Abstract: Multi-vector late-interaction retrievers such as ColBERT achieve state-of-the-art quality, but their query-time cost is dominated by exhaustively computing token-level MaxSim interactions for every candidate document. The MaxSim scores of $N$ candidates against $T$ query tokens form an $N\times T$ matrix whose row-sums are the late-interaction scores, and identifying the top-$K$ rarely requires every entry. We introduce Col-Bandit, a query-time estimator of the exhaustive-MaxSim top-$K$: it reveals matrix entries in batches, maintains a finite-population Bernstein-Serfling confidence interval on each candidate's score, and permanently drops any document whose upper bound falls below the $K$-th largest lower bound, computing only the cells needed to separate the top-$K$. A single relaxation knob $\alpha_{\mathrm{ef}}\in(0,1]$ tunes the compute-fidelity trade-off. We deploy $\alpha_{\mathrm{ef}}{=}0.2$, while $\alpha_{\mathrm{ef}}{=}1$ admits a $\delta$-PAC guarantee under a simplified radius. On BEIR and REAL-MM-RAG, Col-Bandit preserves $\geq 90\%$ fidelity to the exhaustive top-$5$ on every corpus while cutting MaxSim FLOPs by up to ${\sim}8\times$, for up to ${\sim}13\times$ single-thread CPU speedups across x86 and ARM. A drop-in reranking layer, it needs no retraining or index changes.
ColBERT (ORG) Bernstein-Serfling (PERSON) Col-Bandit (ORG) fidelity (ORG) CPU (ORG)
Originally published by arXiv CS Read original →