Home › Knowledge Base › Research Gap Inference

Research Gap Inference

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

arXiv:2606.09748v1 Announce Type: new Abstract: Existing benchmarks for deep research agents (DRAs) assess only single-shot outputs, ignoring a key question: can DRAs improve their reports when guided by feedback? To investigate this, we conduct a multi-turn evaluation of DRAs under two feedback settings: self-reflection, in which the agent revises its report without any external diagnostic signal, and process-level feedback, in which the agent receives guidance targeting gaps in its...

arXiv CS 1d ago

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

arXiv:2606.01961v2 Announce Type: replace Abstract: Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research...

arXiv CS 6d ago

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

arXiv:2606.01961v1 Announce Type: new Abstract: Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research...

arXiv CS 8d ago

Automated IEP Generation from Traditional Chinese Parent-Teacher Interviews via Corpus-Grounded Feature Diffusion

arXiv:2606.09603v1 Announce Type: new Abstract: Writing Individualized Education Programs (IEPs) is a high-labor, knowledge-intensive document burden; English-language research has demonstrated that generative AI can significantly reduce drafting time, yet automated IEP generation in Traditional Chinese remains virtually unexplored due to domain data scarcity, strict privacy regulations, and the absence of local evaluation benchmarks. We propose a low-resource fine-tuning pipeline centered...

arXiv CS 1d ago

Argonne flexes spare supercompute to build private AI inference service

The Register 13d ago

Test-Time Scaling in Multimodal Foundation Models: A Comprehensive Survey of Generation and Reasoning

arXiv:2606.08231v1 Announce Type: new Abstract: Test-time Scaling (TTS) has emerged as a pivotal research direction for enhancing model performance by dynamically allocating computational resources during inference. Recent advancements have adapted this paradigm to Multimodal Foundation Models (MFMs), unlocking their potential in multimodal reasoning and generation. Despite rapid progress, the field lacks a systematic survey and unified theoretical framework to delineate the developmental...

arXiv CS 1d ago

When AI Builds Itself: Our progress toward recursive self-improvement

For most of AI’s history, humans drove every step in its development cycle. But at Anthropic, we are delegating a growing share of AI development to AI systems themselves, which is speeding up our work. Taken far enough, and given enough compute, that trend points to an AI system capable of fully autonomously designing and developing its own successor.

Hacker News 6d ago

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second

From the first roaring racer of the combustion age to the sonic boom that shattered the sound barrier, humanity's hunger for speed is written into our very DNA. The speed of AI reasoning is no different — it defines the boundaries of intelligence itself. When a model is fast enough, it ceases to be a tool you wait on and becomes an extension of your own thinking: responding in real time, iterating in an instant, collaborating without friction.

Hacker News 2d ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Announce Type: replace Abstract: As the foundational architecture of modern machine learning, Transformers have driven remarkable progress across diverse AI domains. Despite their transformative impact, a persistent challenge across various Transformers is Attention Sink (AS), in which a disproportionate amount of attention is focused on a small subset of specific yet uninformative tokens. AS complicates interpretability, significantly affecting the training and inference dynamics, and...

arXiv CS 2d ago

MindVoice: Reconstructing Intelligible Speech from Non-invasive Neural Signals with Pretrained Priors

arXiv:2605.31173v1 Announce Type: new Abstract: Reconstructing continuous speech from non-invasive neural recordings is a fundamental problem for probing human auditory perception and building safe, scalable speech brain-computer interfaces. Despite recent progress, intelligible reconstruction remains elusive, as non-invasive recordings are inherently noisy, spatially blurred, and only partially preserve information about perceived speech. Existing methods directly map neural activity to...

arXiv CS 9d ago