Home Knowledge Base the Causal Sensitivity Score

the Causal Sensitivity Score

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents

Announce Type: new Abstract: Two clinical AI systems can score nearly identically on coverage-based rubrics yet behave radically differently when their patient inputs change: one updates its recommendations to match the new clinical signal, while the other produces the same output regardless. We introduce the Causal Sensitivity Score (CSS), a pre-registered interventional metric that mutates oncology tumor-board cases along five clinically meaningful dimensions - biomarker flips,...

arXiv CS 9d ago

EgoAction: Egocentric Action Composition with Reliability-Aware Temporal Fusion for the EPIC-KITCHENS Action Detection Challenge at CVPR 2026

Announce Type: replace Abstract: The EPIC-KITCHENS-100 Action Detection challenge evaluates whether a model can localize the start and end of each action in long untrimmed egocentric videos and assign the corresponding verb--noun action label. In this report, we formulate our submission as EgoAction (Egocentric Action Composition with Reliability-Aware Temporal Fusion), a unified decoupled detection and fusion pipeline. The pipeline uses EPIC-finetuned VideoMAE-L features, trains separate...

arXiv CS 5d ago

Did Claude increase bugs in rsync?

A simple distributional analysis of every rsync release with bug data. Nothing complicated, answers only one question: are the Claude-assisted releases unusually buggy? In order to avoid accuastions of this "just being Claude defending Claude," "AI slop," "probably all hallucinations," etc., I've decided it's probably worth explaining a few key points about how this report was created: In late May 2026, rsync blew up.

Hacker News 5d ago

DXA-Derived Skeletal Phenotypes and Hip Fracture Risk: A Backdoor-Adjusted Causal Analysis

Announce Type: cross Abstract: Purpose: To compare dual-energy X-ray absorptiometry (DXA)-derived hip skeletal phenotypes in relation to hip fracture risk using prespecified confounder adjustment and to assess whether phenotypes ranked by their backdoor-adjusted average treatment effects (ATEs) improve risk stratification. Methods: We analyzed 21,098 UK Biobank participants with linked health records, hip DXA-derived skeletal measures, and prespecified covariates. Sixteen phenotypes spanning...

arXiv CS 7d ago

Deep learning four decades of human migration

Abstract Human migration is a fundamental driver of global demographic change, shaping population structure, labour markets and social policy across countries1,2,3. Although long-term migration patterns are often linked to economic development4, they can shift rapidly in response to shocks such as conflict, environmental crises and political change5. Despite its importance, migration remains difficult to measure consistently: existing data are sparse, concentrated in high-income settings and...

Nature 20h ago