the Gold Standard

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

When the Gold Standard Isn't Necessarily Standard: Challenges of Evaluating the Translation of User-Generated Content

arXiv:2512.17738v3 Abstract: User-generated content (UGC) is characterised by frequent use of non-standard language, from spelling errors to expressive choices such as slang, character repetitions, and emojis. This makes evaluating UGC translation challenging: what counts as a "good" translation depends on the desired standardness level of the output. To explore this, we examine the human translation guidelines of four UGC datasets, and derive a taxonomy of twelve...

arXiv CS 8d ago

Trump's gold standard science is harming US science and health

On 29 May 2026, the US White House's Office of Management and Budget (OMB) proposed a rule that directs political appointees to require adherence with “gold standard science” in the awarding of federal grants, including research grants funded by the National Institutes of Health (NIH) and the National Science Foundation. This process, if finalised, would permit political appointees - officials placed in senior leadership or policy roles through a political appointment process, such as the...

BMJ (British Medical Journal) 10h ago

Reasoning without Gold Standards: A Proxy-Judge Theory of Autoformalization

Abstract: Complex reasoning tasks increasingly require systems to produce outputs whose correctness cannot be judged by exact match against a single reference. Autoformalization (AF) is a representative example; it asks a model to translate informal mathematical or logical reasoning into a formally checkable object, yet expert-validated formalizations do not scale beyond toy cases and a single informal argument can admit many valid formal renderings. Progress therefore...

arXiv CS 1d ago

Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation

arXiv:2606.07936v1 Abstract: Human evaluation plays a critical role in assessing the quality of generated text. However, the reliability and reproducibility of these evaluations depend on transparent and well-documented protocols -- details that are frequently missing in current practice. In this work, we conduct a large-scale analysis of human evaluation protocols for evaluating long-form generation tasks in *CL conference publications from 2023--2025, including a full...

arXiv CS 1d ago

HiPS: Hierarchical PDF Segmentation of Doctrinal Legal Books

Abstract: PDF parsers have recently improved on page-level layout understanding. However, recovering a document-global section hierarchy with reliable boundaries remains brittle for deeply structured books: many systems expose only page-local heading roles, assume shallow depth, or rely on high-quality PDF tags or Table of Contents (TOC) metadata, and public gold-standard data for deep book hierarchies is scarce. We present HiPS for hierarchical PDF segmentation of...

arXiv CS 2d ago

New study casts doubt on reliability of mental health diagnosis interviews

Diagnostic interviews seen as ‘gold standard’ vary in reliability from condition to condition, study says. Diagnostic interviews – the most common way to diagnose substance use and mental disorders including depression, anxiety, bipolar and personality disorders – vary in reliability from condition to condition, according to a new study in Jama Network Open. Laura Duncan, a psychiatry professor at McMaster University in Ontario, Canada, and one of the study’s authors, said diagnostic...

The Guardian Health 4d ago

Lindsey Graham says Trump is ‘not far behind God’ after he survived primary challenge

‘You’re the gold standard in the Republican world, the most consequential endorsement, I think, in the history of politics,’ Graham told Trump. Fresh off his primary win, Lindsey Graham lavished praise on Donald Trump, calling him “not far behind God” and casting him as the Republican Party’s undisputed kingmaker. The four-term South Carolina senator made the remarks...

The Independent World 28m ago

Broadcom tumbles as revenue miss clouds AI boom bets

June 4: Broadcom shares sank about 12 per cent in premarket trading on Thursday, a day after the company missed quarterly revenue views and disappointed investors' lofty expectations of stronger momentum from the AI boom. The chipmaker could lose more than $285 billion in market cap at the current price of $418.83, if losses hold. Broadcom vies with Nvidia, whose graphics processors remain the gold standard for AI workloads, underscoring...

Channel News Asia 6d ago

Digging Up Citations: FOSSIL, a Dataset and Workflow for Reference Extraction in Law and the Humanities

arXiv:2606.01109v1 Abstract: Citation extraction tools are designed for the structured end-of-document bibliographies of the natural sciences, but law and humanities scholarship cites references primarily in footnotes, where bibliographic data is interleaved with commentary and cross-references and varies widely across languages and styles. To address the scarcity of suitable gold-standard resources, we present FOSSIL (Footnote-based Open-access SSH Scientific Instance...

arXiv CS 8d ago