Home Knowledge Base Framework for Evidence-Grounded

Framework for Evidence-Grounded

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Extending AI for Research to the Humanities: A Multi-Agent Framework for Evidence-Grounded Scholarship

arXiv:2605.30947v3 Announce Type: replace Abstract: LLM-based research agents have advanced rapidly in science and engineering, where research is organized around executable experiments, code, and quantitative signals. Humanities scholarship, however, requires a different mode of reasoning: interpretive, evidence-grounded argument over primary sources, where scholarly value depends on faithful quotation, verifiable provenance, and close reading. Existing research agents remain largely...

arXiv CS 6d ago

Extending AI for Research to the Humanities: A Multi-Agent Framework for Evidence-Grounded Scholarship

Announce Type: new Abstract: LLM-based research agents have advanced rapidly in science and engineering, where research is organized around executable experiments, code, and quantitative signals. Humanities scholarship, however, requires a different mode of reasoning: interpretive, evidence-grounded argument over primary sources, where scholarly value depends on faithful quotation, verifiable provenance, and close reading. Existing research agents remain largely optimized for execution and...

arXiv CS 9d ago

Extending AI for Research to the Humanities: A Multi-Agent Framework for Evidence-Grounded Scholarship

arXiv:2605.30947v2 Announce Type: replace Abstract: LLM-based research agents have advanced rapidly in science and engineering, where research is organized around executable experiments, code, and quantitative signals. Humanities scholarship, however, requires a different mode of reasoning: interpretive, evidence-grounded argument over primary sources, where scholarly value depends on faithful quotation, verifiable provenance, and close reading. Existing research agents remain largely...

arXiv CS 7d ago

EGTR-Review: Efficient Evidence-Grounded Scientific Peer Review Generation via Multi-Agent Teacher Distillation

arXiv:2606.06025v1 Announce Type: new Abstract: Scientific peer review generation has attracted increasing attention for reducing reviewing burdens and providing timely feedback. However, existing Large Language Model (LLM)-based methods often produce generic comments with insufficient evidence support and weak source traceability, while complex multi-agent systems incur high inference costs. To address these challenges, we propose EGTR-Review, an Evidence-Grounded and Traceable Review...

arXiv CS 5d ago

Better with Experience: Self-Evolving LLM Agents for Evidence-Grounded Health Community Notes

new Abstract: Large Language Model (LLM)-augmented Community Notes offer a scalable path for timely, evidence-grounded correction of health misinformation on social platforms. However, they still reset at every post, leaving useful correction experience from prior cases unused. We introduce EvoNote, an agentic framework that enables health Community Notes generation to self-evolve through an evolving experience memory of prior misinformation correction episodes.

arXiv CS 8d ago

Evidence-Grounded Ensemble Diagnosis of 802.11 Packet Captures: A Multi-Stage Pipeline with Deterministic Reliability Scoring

arXiv:2606.06871v1 Announce Type: new Abstract: Diagnosing 802.11 packet captures requires expert protocol knowledge, is slow, inconsistent across engineers, and unscalable. LLM-based approaches sound plausible but fabricate protocol events absent from captures (especially truncated traces), produce uncalibrated confidence scores, and suffer evaluation bias when golden references are co-produced by the model under test. We introduce PROBE (Protocol Reasoning Over evidence-Based Ensembles), a...

arXiv CS 2d ago

Towards Efficient and Evidence-grounded Mobility Prediction with LLM-Driven Agent

arXiv:2606.05130v1 Announce Type: new Abstract: Individual-level mobility prediction is central to urban simulation, transportation planning, and policy analysis. Supervised sequence models achieve strong accuracy but require task-specific training and offer limited decision-level transparency. Recent LLM-based methods improve interpretability, yet mostly rely on static prompts and single-pass inference, limiting their ability to seek additional evidence when mobility signals are weak or...

arXiv CS 6d ago

Hierarchical Online Prompt Mutation with Dual-Loop Feedback for Guardrailed Evidence Document Generation: A Production-Evaluation Case Study

arXiv:2606.01472v1 Announce Type: new Abstract: High-stakes production document-generation systems require language models to be adaptive, evidence-grounded, and auditable. We present HOPM, a hierarchical online prompt mutation framework evaluated on a real marketplace dispute-evidence workflow. HOPM treats prompts as online policies: a family/version router selects a prompt, deterministic guardrails attribute failures to mutable prompt-token categories, and dual feedback from human review...

arXiv CS 8d ago

Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents

arXiv:2605.08717v2 Announce Type: replace Abstract: Software engineering agents are increasingly deployed in evaluable engineering environments, yet post-failure recovery remains costly, manual, and ad hoc. Existing systems expose traces or generate follow-up feedback, but they do not convert heterogeneous runtime evidence into grounded, bounded recovery guidance for a subsequent attempt. We present PROBE, a failure-anchored framework for structured recovery in software engineering agents.

arXiv CS 2d ago

OARelatedWork: A Large-Scale Dataset of Related Work Sections with Full-texts from Open Access Sources

arXiv:2405.01930v2 Announce Type: replace Abstract: This paper introduces OARelatedWork: a dataset for related work generation from open-access sources. It is the first large-scale multi-document summarization dataset for related work generation, containing whole related work sections and full texts of cited papers. Its validation and test splits are constructed so that every cited paper is available in full text, enabling controlled evaluation of full-text related work generation.

arXiv CS 8d ago