Scalable Clinical Oversight of Large Language Models Via Uncertainty Triangulation
SCOUT
Prospective Evaluation of a Model-Agnostic Meta-Verification Framework (SCOUT) for Scalable Clinical Oversight of Large Language Model Outputs in Coronary Heart Disease Diagnosis: A Multi-Reader, Randomized, Crossover Trial
interventional
7
N/A
Brief Summary
This prospective, multi-reader, randomized crossover trial evaluates SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable large language model (LLM) predictions to clinicians by triangulating three orthogonal uncertainty signals: model heterogeneity, stochastic inconsistency, and reasoning critique. The trial assesses whether SCOUT-assisted review can reduce physician review time compared with standard manual review of AI-generated diagnoses while maintaining non-inferior diagnostic accuracy in coronary heart disease (CHD) subtyping.
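The triangulation idea in the summary can be sketched as follows. This is a minimal illustration, not the trial's implementation: the registry record does not say how the three signals are scored or combined, so the `disagreement` measure, the "defer if any signal fires" rule, the thresholds, and the label names are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


def disagreement(labels):
    """Fraction of labels that differ from the most common label (0.0 = unanimous)."""
    if not labels:
        raise ValueError("labels must be non-empty")
    top_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - top_count / len(labels)


@dataclass
class UncertaintySignals:
    model_heterogeneity: float       # disagreement across different models
    stochastic_inconsistency: float  # disagreement across repeated samples of one model
    critique_flagged: bool           # True if a critique step objects to the reasoning


def defer(signals, het_threshold=0.5, stoch_threshold=0.5):
    """Return the deferral decision: 1 = send to a clinician, 0 = auto-accept.

    Assumed rule: defer when ANY of the three signals fires.
    The thresholds are placeholders, not values from the trial.
    """
    fired = (
        signals.model_heterogeneity >= het_threshold
        or signals.stochastic_inconsistency >= stoch_threshold
        or signals.critique_flagged
    )
    return 1 if fired else 0


# Hypothetical example: three models vote, one model is sampled five times.
signals = UncertaintySignals(
    model_heterogeneity=disagreement(["subtype_A", "subtype_A", "subtype_B"]),
    stochastic_inconsistency=disagreement(["subtype_A"] * 5),
    critique_flagged=False,
)
decision = defer(signals)
```

An "any signal fires" rule errs toward sending more cases to clinicians; a weighted or learned combination would be an equally plausible design, and the record does not indicate which approach SCOUT uses.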
Study Timeline
Key milestones and dates
First Submitted: February 9, 2026 (initial submission to the registry)
First Posted: February 17, 2026 (study publicly available on registry)
Study Start: February 19, 2026 (first participant enrolled)
Primary Completion: February 28, 2026 (last participant's last visit for primary outcome)
Study Completion: February 28, 2026 (last participant's last visit for all outcomes)
Outcome Measures
Primary Outcomes (1)
Mean physician review time per case (minutes)
Mean time spent by each clinician reviewing and rendering a diagnostic decision per case under each arm. Measured in minutes.
Through study completion, an average of 2 hours.
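As a rough sketch of how the primary outcome could be tabulated, the snippet below averages review time per clinician and arm, then takes the within-clinician difference that a crossover design makes available. The record layout, the arm labels `"control"` and `"scout"`, and the choice to log auto-accepted cases as 0.0 minutes are assumptions for illustration; the registry entry does not specify them.

```python
from collections import defaultdict
from statistics import mean


def mean_review_time(records):
    """records: iterable of (reader_id, arm, minutes).

    Returns {(reader_id, arm): mean minutes per case}.
    """
    grouped = defaultdict(list)
    for reader_id, arm, minutes in records:
        grouped[(reader_id, arm)].append(minutes)
    return {key: mean(values) for key, values in grouped.items()}


def paired_differences(per_reader_means, control="control", experimental="scout"):
    """Within-reader difference (experimental minus control); negative means time saved."""
    readers = {reader for reader, _ in per_reader_means}
    return {
        r: per_reader_means[(r, experimental)] - per_reader_means[(r, control)]
        for r in sorted(readers)
        if (r, control) in per_reader_means and (r, experimental) in per_reader_means
    }


# Made-up timings for one hypothetical reader (not trial data).
records = [
    ("reader_1", "control", 4.0),
    ("reader_1", "control", 6.0),
    ("reader_1", "scout", 0.0),   # assumed: auto-accepted case logged as 0 minutes
    ("reader_1", "scout", 3.0),
]
per_reader = mean_review_time(records)
differences = paired_differences(per_reader)
```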
Secondary Outcomes (2)
Diagnostic accuracy (%)
Through study completion, an average of 2 hours.
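The summary frames accuracy as a non-inferiority question. One simple way to express such a check is shown below, under stated assumptions: the 10-percentage-point margin, the one-sided 95% level (z = 1.645), and the Wald interval are illustrative choices, not the trial's statistical analysis plan, and this naive interval ignores the reader-level clustering a multi-reader crossover design would normally account for.

```python
from math import sqrt


def accuracy(correct, total):
    """Proportion of cases diagnosed correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def noninferiority_check(correct_exp, n_exp, correct_ctl, n_ctl, margin=0.10, z=1.645):
    """Compare experimental vs control accuracy against a non-inferiority margin.

    Non-inferior when the lower confidence bound of (experimental - control)
    lies above -margin. margin and z are hypothetical placeholders.
    """
    p_exp = accuracy(correct_exp, n_exp)
    p_ctl = accuracy(correct_ctl, n_ctl)
    diff = p_exp - p_ctl
    se = sqrt(p_exp * (1 - p_exp) / n_exp + p_ctl * (1 - p_ctl) / n_ctl)
    lower = diff - z * se
    return {"difference": diff, "lower_bound": lower, "non_inferior": lower > -margin}


# Illustrative counts only; the set sizes (56 and 54) come from the study arms.
result = noninferiority_check(correct_exp=50, n_exp=56, correct_ctl=48, n_ctl=54)
```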
Computational Return on Investment (ROI)
Through study completion, an average of 2 hours.
Study Arms (2)
Control (Standard Manual Review)
ACTIVE COMPARATOR: Physicians manually review all cases in the control set (n=54) with access to AI predictions and reasoning. No selective deferral.
Experimental (SCOUT-Assisted Review)
EXPERIMENTAL: Physicians process the intervention set (n=56) through the SCOUT framework. Low-uncertainty cases are auto-accepted; high-uncertainty cases undergo physician review with full audit trail.
Interventions
SCOUT-Assisted Review (Intervention Arm): Physicians review 56 cases processed through the SCOUT framework. For cases classified as low-uncertainty (D(x)=0), the AI prediction is auto-accepted without physician review. For high-uncertainty cases (D(x)=1), the physician reviews the case with access to the main model's chain-of-thought reasoning and the meta-verification audit results. The main model is DeepSeek-V3.1 with chain-of-thought prompting.
Standard Manual Review (Control Arm): Physicians perform a full manual review of 54 cases using raw medical records, with access to the AI model's predictions and reasoning but without SCOUT uncertainty stratification or selective deferral.
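The routing step in the intervention arm, where D(x)=0 cases are auto-accepted and D(x)=1 cases go to a physician, can be sketched as below. The dictionary keys (`case_id`, `deferral`, `ai_prediction`) and the example cases are hypothetical; only the 0/1 meaning of D(x) is taken from the intervention description.

```python
def route_cases(cases):
    """Split cases by deferral decision D(x).

    Each case is a dict with 'case_id', 'deferral' (0 or 1) and 'ai_prediction'.
    Returns (auto_accepted, needs_review).
    """
    auto_accepted, needs_review = [], []
    for case in cases:
        if case["deferral"] == 0:
            auto_accepted.append(case)   # AI prediction stands without physician review
        elif case["deferral"] == 1:
            needs_review.append(case)    # physician reviews with reasoning and audit results
        else:
            raise ValueError(f"unexpected deferral value for case {case['case_id']}")
    return auto_accepted, needs_review


def deferral_rate(cases):
    """Share of cases sent to physician review."""
    if not cases:
        raise ValueError("cases must be non-empty")
    return sum(case["deferral"] for case in cases) / len(cases)


# Hypothetical cases for illustration (not trial data).
cases = [
    {"case_id": "c1", "deferral": 0, "ai_prediction": "subtype_A"},
    {"case_id": "c2", "deferral": 1, "ai_prediction": "subtype_B"},
    {"case_id": "c3", "deferral": 0, "ai_prediction": "subtype_A"},
    {"case_id": "c4", "deferral": 0, "ai_prediction": "subtype_C"},
]
auto_accepted, needs_review = route_cases(cases)
rate = deferral_rate(cases)
```

The deferral rate is the natural link to the primary outcome: physician time in the intervention arm is spent only on the deferred share of cases.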
Eligibility Criteria
You may qualify if:
- Board-certified or in-training cardiologists at Fuwai Hospital
- Spanning three experience strata: junior residents, senior residents, attending physicians
You may not qualify if:
- Clinicians involved in the development or optimization of the SCOUT framework
- Clinicians involved in the gold-standard adjudication process
Study Design
- Study Type: Interventional
- Phase: Not applicable
- Allocation: Randomized
- Masking: None
- Purpose: Diagnostic
- Intervention Model: Crossover
- Sponsor Type: OTHER GOV
- Responsible Party: Sponsor
Study Record Dates
First Submitted: February 9, 2026
First Posted: February 17, 2026
Study Start: February 19, 2026
Primary Completion: February 28, 2026
Study Completion: February 28, 2026
Last Updated: February 17, 2026
Record last verified: 2026-02
Data Sharing
- IPD Sharing: Will share
- Shared Documents: Study protocol, SAP, ICF, CSR, analytic code
- Time Frame: Beginning 1 month after publication of the primary results and available for up to 60 months.
- Access Criteria: Data are available from the corresponding author upon reasonable request. Requestors will need to provide a methodologically sound research proposal and sign a data use agreement.
De-identified individual participant data underlying the results reported in this study will be made available.