BEIR

Related Articles from SNS

Test-Time Training for Zero-Resource Dense Retrieval Reranking

Announce Type: new Abstract: Dense retrievers excel at first-stage candidate generation but lack effective reranking in zero-resource settings. Existing approaches face a fundamental dilemma: cross-encoders deliver strong reranking quality but require costly supervised training and incur high latency, while unsupervised BM25 reranking consistently degrades dense retrieval performance on most BEIR benchmarks. We propose DART (Dense Adaptive Reranking at Test-time), which resolves this...

arXiv CS 8d ago

DynaTree: Dynamic Agentic Retrieval Tree for Time-Sensitive News Retrieval

arXiv:2605.31377v1 Announce Type: new Abstract: Agentic Retrieval-Augmented Generation improves retrieval by integrating planning, tool use, and iterative reasoning, but existing agentic RAG methods often couple semantic expansion with retrieval decisions in short-horizon inference loops, leading to high inference cost and limited suitability for time-sensitive news retrieval. We propose DynaTree, a two-stage framework for efficient and adaptive news retrieval. In the offline stage, DynaTree...

arXiv CS 9d ago

Col-Bandit: Query-Time Top-$K$ Estimation for Late-Interaction Retrieval

Announce Type: replace Abstract: Multi-vector late-interaction retrievers such as ColBERT achieve state-of-the-art quality, but their query-time cost is dominated by exhaustively computing token-level MaxSim interactions for every candidate document. The MaxSim scores of $N$ candidates against $T$ query tokens form an $N\times T$ matrix whose row-sums are the late-interaction scores, and identifying the top-$K$ rarely requires every entry. We introduce Col-Bandit, a query-time estimator of...

arXiv CS 7d ago
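
The Col-Bandit teaser above describes the MaxSim scores of $N$ candidates against $T$ query tokens as an $N\times T$ matrix whose row-sums are the late-interaction scores. A minimal sketch of that exhaustive baseline, in plain Python, might look like the following. It is not the Col-Bandit estimator itself, which the truncated abstract does not specify; the function names, the use of dot-product similarity, and the toy vectors are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as the (assumed) token similarity."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_matrix(query_tokens: List[Vector],
                  docs: List[List[Vector]]) -> List[List[float]]:
    """Build the N x T matrix: entry [i][t] is the best similarity
    between query token t and any token of candidate document i."""
    return [
        [max(dot(q, d) for d in doc_tokens) for q in query_tokens]
        for doc_tokens in docs
    ]


def late_interaction_scores(matrix: List[List[float]]) -> List[float]:
    """Row-sums of the MaxSim matrix, one score per candidate."""
    return [sum(row) for row in matrix]


def top_k(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring candidates, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy data (hypothetical): T = 2 query tokens, N = 3 candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = [
    [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    [[0.5, 0.5]],              # partial match on both
    [[0.0, 0.8]],              # matches only the second query token
]

matrix = maxsim_matrix(query, candidates)
scores = late_interaction_scores(matrix)
print(top_k(scores, k=2))
```

This baseline fills in every one of the N x T entries before ranking. The abstract's observation is that identifying the top-$K$ rarely requires every entry, so a query-time estimator can aim to skip part of this work.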

$\mathrm{ECI}_{\mathrm{sem}}$: Semantic Residual Effective Contrastive Information for Evaluating Hard Negatives

arXiv:2603.20990v3 Announce Type: replace Abstract: Hard-negative source selection for dense retrieval is usually decided only after fine-tuning and downstream evaluation. We propose $\mathrm{ECI}_{\mathrm{sem}}$, a semantic residual variant of Effective Contrastive Information (ECI) that ranks candidate negative sources using frozen target-encoder embeddings. $\mathrm{ECI}_{\mathrm{sem}}$ is training-free, not label-free: each scored example requires a query, a labeled positive, and an...

arXiv CS 2d ago

Superintelligent Retrieval Agent: The Next Frontier of Agentic Retrieval

arXiv:2605.06647v2 Announce Type: replace Abstract: Retrieval-augmented agents are increasingly the interface to large knowledge bases, yet most treat retrieval as a black box: they issue exploratory queries, inspect snippets, and reformulate until evidence emerges. This resembles how a newcomer searches an unfamiliar database rather than how an expert navigates it with strong priors about terminology and likely evidence, causing extra retrieval rounds, latency, and poor recall. We introduce...

arXiv CS 2d ago

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

arXiv:2605.30120v2 Announce Type: replace Abstract: Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval efficiency bottlenecks: to manage the immense memory footprint and computational overhead of billion-scale token vectors, state-of-the-art systems are forced to rely on aggressive dimension reduction and...

arXiv CS 9d ago

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

arXiv:2605.30120v3 Announce Type: replace Abstract: Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval efficiency bottlenecks: to manage the immense memory footprint and computational overhead of billion-scale token vectors, state-of-the-art systems are forced to rely on aggressive dimension reduction and...

arXiv CS 6d ago

ECI: Effective Contrastive Information to Evaluate Hard-Negatives

arXiv:2603.20990v2 Announce Type: replace Abstract: Hard-negative source selection for dense retrieval is usually decided only after fine-tuning and downstream evaluation. We propose Effective Contrastive Information (ECI), a training-free diagnostic that ranks candidate negative sources using frozen target-encoder embeddings. ECI is training-free, not label-free: each scored example requires a query, a labeled positive, and an explicit candidate negative.

arXiv CS 5d ago