Multilingual Information Retrieval

Related Articles from SNS

MIMO: Multilingual Information Retrieval via Monolingual Objectives

Announce Type: new Abstract: Multilingual Information Retrieval (MLIR) reflects real-world search environments in which queries and relevant documents may appear in different languages within a mixed-language corpus. However, existing embedding models are primarily optimized for Multi-Monolingual retrieval and their performance often degrades in MLIR settings. Moreover, directly applying conventional contrastive learning to MLIR can exacerbate language clustering and expose a trade-off...

arXiv CS 9d ago

CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval

arXiv:2506.11066v3 Announce Type: replace Abstract: Code retrieval is essential in modern software development, as it boosts code reuse and accelerates debugging. However, current benchmarks primarily emphasize functional relevance while neglecting critical dimensions of software quality. Motivated by this gap, we introduce CoQuIR, the first large-scale, multilingual benchmark specifically designed to evaluate quality-aware code retrieval across four key dimensions: correctness, efficiency,...

arXiv CS 2d ago

Linguistic Nepotism: Trading-off Quality for Language Preference in Multilingual RAG

arXiv:2509.13930v3 Announce Type: replace Abstract: Multilingual Retrieval-Augmented Generation (mRAG) systems enable language models to answer knowledge-intensive queries with citation-supported responses across languages. Despite their growing use, an open question is whether the mixture of different document languages impacts generation and citation behavior in unintended ways. To investigate this, we introduce a controlled methodology using model internals to measure language preference...

arXiv CS 1d ago

AgriGov: A Structured Multilingual Dataset Curation for Indian Government Schemes for Farmers

arXiv:2606.08272v1 Announce Type: new Abstract: AgriGov is a curated, trilingual (English-Hindi-Marathi) dataset designed to address the scarcity of domain-grounded multilingual resources for agricultural policies and farmer welfare schemes. Initially, we collected and structured data from 50 government schemes sourced from trusted portals using automated scraping techniques, organizing it into predefined semantic fields (e.g., title, eligibility, application process, documents, exclusions).

arXiv CS 1d ago

Introducing multiplex semantic networks as multifaceted representations of creative associative knowledge across multilingual samples

arXiv:2606.09403v1 Announce Type: new Abstract: Creativity is a complex cognitive ability that relies on knowledge organisation and retrieval from semantic memory. Yet most research uses a single task to measure it, capturing only a fraction of this complexity. This study investigates multiplex networks - layered semantic networks obtained from six cognitive tasks - as a more comprehensive approach to modelling the associative knowledge underlying creativity.

arXiv CS 1d ago